Abstract


Coronaviruses are animal and human pathogens that can cause lethal zoonotic infec- 
tions like SARS and MERS. They have polycistronic plus-stranded RNA genomes and 
belong to the order Nidovirales, a diverse group of viruses for which common ancestry 
was inferred from the common principles underlying their genome organization and 
expression, and from the conservation of an array of core replicase domains, including 
key RNA-synthesizing enzymes. Coronavirus genomes (~26-32 kilobases) are the larg- 
est RNA genomes known to date and their expansion was likely enabled by acquiring 
enzyme functions that counter the commonly high error frequency of viral RNA poly- 
merases. The primary functions that direct coronavirus RNA synthesis and processing 
reside in nonstructural protein (nsp) 7 to nsp16, which are cleavage products of two 
large replicase polyproteins translated from the coronavirus genome. Significant pro- 
gress has now been made regarding their structural and functional characterization, 
stimulated by technical advances like improved methods for bioinformatics and 
structural biology, in vitro enzyme characterization, and site-directed mutagenesis of 
coronavirus genomes. Coronavirus replicase functions include more or less universal 
activities of plus-stranded RNA viruses, like an RNA polymerase (nsp12) and helicase 
(nsp13), but also a number of rare or even unique domains involved in mRNA capping 
(nsp14, nsp16) and fidelity control (nsp14). Several smaller subunits (nsp7–nsp10) act as
crucial cofactors of these enzymes and contribute to the emerging “nsp interactome.” 
Understanding the structure, function, and interactions of the RNA-synthesizing 
machinery of coronaviruses will be key to rationalizing their evolutionary success 
and the development of improved control strategies. 


1. INTRODUCTION 


Coronaviruses (CoVs) are the best-known and best-studied clade of
the order Nidovirales, which is composed of enveloped plus-stranded
RNA (+RNA) viruses and currently also includes the Arteriviridae, Roniviridae,
and Mesoniviridae families (de Groot et al., 2012a,b; Lauber et al., 2012).
In addition to including various highly pathogenic CoVs of livestock 
(Saif, 2004) and four “established” human CoVs causing a large number 
of common colds (Pyrc et al., 2007), CoVs have attracted abundant attention 
due to their potential to cause lethal zoonotic infections (Graham et al., 
2013). This was exemplified by the 2003 outbreak of severe acute respiratory 
syndrome-coronavirus (SARS-CoV) in Southeast Asia and the ongoing 
transmission—since 2012—of the Middle East respiratory syndrome- 
coronavirus (MERS-CoV), which causes ~35% mortality among patients 
seeking medical attention. Both these viruses are closely related to CoVs that 
are circulating in bats (Ge et al., 2013; Menachery et al., 2015) and other 
potential reservoir species. They may be transmitted to humans either
directly or through intermediate hosts, like civet cats for SARS-CoV 
(Song et al., 2005) and dromedary camels for MERS-CoV (Reusken 
et al., 2013). Formally, the family Coronaviridae now includes about 30
species, divided into the subfamilies Torovirinae and Coronavirinae, the latter
being further subdivided into the genera Alpha-, Beta-, Gamma-, and
Deltacoronavirus. SARS-CoV and MERS-CoV are betacoronaviruses, and 
the same holds true for one of the best-characterized animal CoV models, 
murine hepatitis virus (MHV). This explains why the bulk of our current 
knowledge of CoV molecular biology is betacoronavirus based, even more 
so for the replicative proteins that are the central theme of this review, which 
will mainly summarize data obtained studying SARS-CoV proteins. 

Despite their unification in the same virus order, nidoviruses cover an 
unusually broad range of genome sizes, ranging from ~13-16 kilobases 
(kb) for arteriviruses, via ~20 kb for mesoniviruses, to ~26-32 kb for CoVs
(Nga et al., 2011). Together with the genomes of roniviruses, which infect 
invertebrate hosts, CoV genomes are the largest RNA genomes known to 
date (Gorbalenya et al., 2006). The common ancestry of these extremely 
diverse virus lineages was inferred from their polycistronic genome struc- 
ture, the common principles underlying the expression of these genomes, 
and—most importantly—the conservation of an array of “core replicase 
domains,” including key enzymes required for RNA synthesis. While 
retaining this conserved genomic and proteomic blueprint, nidovirus 
genomes are thought to have expanded gradually by gene duplication and 
acquisition of novel genes (Lauber et al., 2013), most likely by RNA recom- 
bination. In addition to the high mutation rate that characterizes all RNA 
viruses, these genomic innovations appear to have enabled nidoviruses to 
explore an unprecedented evolutionary space and adapt to a wide variety 
of host organisms, including mammals, birds, reptiles, fish, crustaceans, 
and insects. Whereas the poor replication fidelity generally restricts RNA 
virus genome sizes, it has been postulated that nidovirus genome expansion 
was enabled by the acquisition of specific replicative functions that counter 
the error rate of the RNA polymerase (Deng et al., 2014; Eckerle et al., 
2010; Snider et al., 2003) (discussed in more detail later). 
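
The link between replication fidelity and genome size can be made concrete with a rough calculation: the expected number of errors per genome copy is the per-nucleotide error rate multiplied by the genome length. The sketch below uses genome lengths from the ranges quoted above. The error rates are illustrative assumptions chosen only to show the trend, not measured values for any nidovirus polymerase.

```python
import math

# Approximate genome lengths (nucleotides), taken from the ranges in the text.
GENOME_SIZES_NT = {
    "arterivirus (~13 kb)": 13_000,
    "mesonivirus (~20 kb)": 20_000,
    "coronavirus (~30 kb)": 30_000,
}

# ASSUMPTION: illustrative per-nucleotide error rates, not experimental data.
ASSUMED_ERROR_RATES = (1e-4, 1e-5, 1e-6)


def expected_errors(genome_length_nt, error_rate_per_nt):
    """Mean number of misincorporations per full-length genome copy."""
    return genome_length_nt * error_rate_per_nt


def fraction_error_free(genome_length_nt, error_rate_per_nt):
    """Probability that one genome copy carries no error at all,
    assuming independent errors at every position."""
    return (1.0 - error_rate_per_nt) ** genome_length_nt


if __name__ == "__main__":
    for name, length in GENOME_SIZES_NT.items():
        for rate in ASSUMED_ERROR_RATES:
            print(
                f"{name:22s} rate={rate:.0e}  "
                f"errors/copy={expected_errors(length, rate):5.2f}  "
                f"error-free={fraction_error_free(length, rate):.3f}"
            )
```

Under these assumptions, a longer genome copied at the same error rate yields fewer error-free copies, which is the qualitative argument behind the proposal that fidelity-enhancing functions enabled genome expansion.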

Fig. 1 Outline of the CoV genome organization and expression strategy, based on
SARS-CoV. The top panel depicts the SARS-CoV genome, including various regulatory
RNA elements, and the 5’- and 3’-coterminal nested set of subgenomic mRNAs used
to express the genes downstream of the replicase gene. UTR, untranslated region;
TRS, transcription-regulatory sequence. Below the RNAs, the 14 open reading frames
in the genome are indicated, i.e., the replicase ORFs 1a and 1b, the four common
CoV structural protein genes (S, E, M, and N) and the ORFs encoding “accessory
proteins.” The bottom panel explains the organization and proteolytic processing of
the pp1a and pp1ab replicase polyproteins, the latter being produced by -1 ribosomal
frameshifting. The nsp3 (PLpro) and nsp5 (3CLpro) proteases and their cleavage sites are
indicated in matching colors. The resulting 16 cleavage products (nonstructural proteins
(nsps)) are indicated, as are the conserved replicase domains that are relevant for this
review. Domain abbreviations and corresponding nsp numbers: PLpro, papain-like
proteinase (nsp3); 3CLpro, 3C-like proteinase (nsp5); TM, transmembrane domain (nsp3,
nsp4, and nsp6); NiRAN, nidovirus RdRp-associated nucleotidyl transferase (nsp12);
RdRp, RNA-dependent RNA polymerase (nsp12); ZBD, zinc-binding domain (nsp13);
HEL1, superfamily 1 helicase (nsp13); ExoN, exoribonuclease (nsp14); N7-MT, N7-methyl
transferase (nsp14); endoU, uridylate-specific endoribonuclease (nsp15); 2’-O-MT,
2’-O-methyl transferase (nsp16).

As in all nidoviruses, at least two-thirds of the CoV genome capacity is
occupied by the two large open reading frames (ORFs) that together
constitute the replicase gene, ORF1a and ORF1b (Fig. 1). These ORFs overlap
by a few dozen nucleotides and are both translated from the viral genome,
with expression of ORF1b requiring a -1 ribosomal frameshift to occur just
upstream of the ORF1a termination codon (Brierley et al., 1989). The
efficiency of this highly conserved frameshift event, which may approach 
50% in the case of CoVs (Irigoyen et al., 2016), is promoted by specific 
primary and higher-order RNA structures. As a result, in CoV-infected cells, 
the replicase subunits encoded in ORF1a are overexpressed in a fixed ratio 
relative to the proteins encoded in ORF1b. The primary translation products 
of the CoV replicase are two huge polyproteins, the ORF1a-encoded pp1a
and the C-terminally extended pp1ab frameshift product (Fig. 1). The former 
is roughly 4000-4500 amino acids long, depending on the CoV species ana- 
lyzed. The size of the ORF1b-encoded extension is more conserved (around 
2700 residues), resulting in pp1ab sizes in the range of 6700-7200 amino
acids. Probably already during their synthesis, either two or three
ORF1a-encoded proteases initiate the proteolytic cleavage of pp1a and pp1ab to
release (sometimes) 15 or (mostly) 16 functional nonstructural proteins (nsps;
Fig. 1). The highly conserved nsp5 protease has a chymotrypsin-like fold
(3C-like protease, 3CLpro) (Anand et al., 2002, 2003; Gorbalenya et al., 1989) and
is the viral “main protease” (therefore sometimes also referred to as Mpro). The
3CLpro cleaves the nsp4–nsp11 part of pp1a and the nsp4–nsp16 part of pp1ab
at 7 and 11 conserved sites, respectively. These sites can be summarized with
the P4-P2’ consensus motif (small)-X-(L/I/V/F/M)-Q | (S/A/G), where X is
any amino acid and | represents the cleavage site. The processing of three sites in
the nsp1–nsp4 region is performed by one or two papain-like proteases (PLpro)
residing in the very large nsp3 subunit (Mielech et al., 2014). Whereas
alphacoronaviruses and most betacoronaviruses (though not SARS-CoV
and MERS-CoV) have two PLpro domains in their nsp3, presumably the
result of an ancient duplication event, gamma- and deltacoronaviruses have
only a single PLpro. The cleavage sites (LXGG| or similar) resemble the
C-terminal LRGG| motif of ubiquitin, which explains why CoV PLpro domains
were found to be capable of also acting as deubiquitinases (Ratia et al., 2006). This
secondary function has been implicated in the disruption of host innate immune
signaling by removing ubiquitin from certain cellular substrates. More than
any other CoV-encoded enzyme, the CoV 3CLpro and PLpro domains have
been characterized in exquisite structural and biochemical detail, both in their
capacity as critical regulators of nsp synthesis and as two of the primary drug
targets for this virus family. Space limitations unfortunately prevent us from 
summarizing these studies in more detail, but a variety of excellent reviews 
is available to compensate for this omission (Baez-Santos et al., 2015; 
Hilgenfeld, 2014; Mielech et al., 2014; Steuber and Hilgenfeld, 2010). 
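
The two protease consensus motifs described in this paragraph can be written as simple sequence patterns. The following is a minimal sketch, assuming that "small" at P4 means one of A, C, G, S, T, or V (the text does not define the set) and using made-up toy peptides rather than real polyprotein sequences. A motif match only marks a candidate site. Actual cleavage also depends on structural context that a pattern search cannot capture.

```python
import re

# 3CLpro: (small)-X-(L/I/V/F/M)-Q | (S/A/G).
# ASSUMPTION: "small" is taken to mean A, C, G, S, T or V.
# The zero-width lookahead lets overlapping candidates be reported.
THREE_CL_PRO = re.compile(r"(?=[ACGSTV].[LIVFM]Q[SAG])")

# PLpro: L-X-G-G | (the "or similar" variants are not modeled here).
PL_PRO = re.compile(r"(?=L.GG)")


def candidate_cleavage_sites(protein, pattern, residues_before_cut=4):
    """Return 0-based indices of the first residue after each candidate cut.

    Both patterns have four residues (P4 to P1) upstream of the scissile
    bond, so the cut lies four positions after the match start.
    """
    return [m.start() + residues_before_cut for m in pattern.finditer(protein)]


if __name__ == "__main__":
    toy_3cl = "GGGAVLQSGGG"   # hypothetical peptide with one AVLQ|S site
    toy_pl = "AALKGGAP"       # hypothetical peptide with one LKGG| site
    print(candidate_cleavage_sites(toy_3cl, THREE_CL_PRO))
    print(candidate_cleavage_sites(toy_pl, PL_PRO))
```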

Once released from pp1a and pp1ab, most CoV nsps studied thus far
assemble into a membrane-bound ribonucleoprotein complex that drives 
the synthesis of different forms of viral RNA (see later) and is sometimes 
referred to as the replication and transcription complex (RTC). While viral 
RNA production takes off, peculiar convoluted membrane structures, 
spherules tethered to zippered endoplasmic reticulum, and double- 
membrane vesicles begin to accumulate in CoV-infected cells (Gosert 
et al., 2002; Knoops et al., 2008; Maier et al., 2013). As for other +RNA 
viruses, they have been postulated to serve as scaffolds, or perhaps even suit- 
able microenvironments, for viral RNA synthesis. Nevertheless, many ques- 
tions on their biogenesis and function remain to be answered, and the exact 
location of the metabolically active RTC still has to be pinpointed “beyond 
reasonable doubt” for CoVs and other nidoviruses (Hagemeijer et al., 2012; 
Neuman et al., 2014a; van der Hoeven et al., 2016). Three ORF1a-encoded
replicase subunits containing transmembrane domains (nsp3, nsp4, and nsp6; 
Fig. 1) have been implicated in the formation of the membrane structures 
that are induced upon CoV infection and with which the RTC is thought 
to be associated (Angelini et al., 2013; Hagemeijer et al., 2014). In addition 
to actively engaging in host membrane remodeling, they may serve as mem- 
brane anchors for the RTC by binding the nsps that lack hydrophobic 
domains, like all of the ORF1b-encoded enzymes. For more details, the 
reader is referred to the numerous recent reviews of the “replication 
organelles” of CoVs and other +RNA viruses (den Boon and Ahlquist, 
2010; Hagemeijer et al., 2012; Neuman et al., 2014a; Romero-Brey and
Bartenschlager, 2016; van der Hoeven et al., 2016; Xu and Nagy, 2014). 

The common ancestry of nidovirus replicases is not only reflected in 
their conserved core replicase domains but also in the synthesis of sub- 
genomic (sg) mRNAs that are used to express the genes located downstream 
of ORF1b (Fig. 1) (Gorbalenya et al., 2006). Although some nidoviruses 
(e.g., roni- and mesoniviruses) have only a few of these genes, they are much 
more numerous in arteriviruses and CoVs, their number going up to about a 
dozen ORFs for some CoVs. In addition to the standard set of four CoV 
structural protein genes (encoding the spike (S), envelope (E), membrane 
(M), and nucleocapsid (N) protein), genomes in different CoV clusters con- 
tain varying numbers of ORFs encoding so-called “accessory proteins” (Liu 
et al., 2014; Narayanan et al., 2008). The proteins they encode are often dis- 
pensable for the basic replicative cycle in cultured cells, but highly relevant 
for CoV viability and pathogenesis in vivo, for example, because they enable 
the virus to interfere with the host’s immune response. Most of the genes 
downstream of ORF1b are made accessible to ribosomes by positioning
them at the 5’ end of their own sg transcript. Occasionally, two or even three 
genes are expressed from the same sg mRNA, usually by employing ribo- 
somal “leaky scanning” during translation initiation. 

Nidoviral sg mRNAs are 3’-coterminal with the viral genome, but in
most nidovirus taxa, including CoVs, the sg transcripts also carry common
5’ leader sequences (~65-95 nucleotides in CoVs), which are identical to
the 5’-terminal sequence of the viral genome (Fig. 1) (Pasternak et al., 
2006; Sawicki et al., 2007; Sola et al., 2011). The joining of common 
leader and different sg RNA “body” sequences occurs during minus-strand 
RNA synthesis (Sawicki and Sawicki, 1995; Sethna et al., 1989). This 
step can be either continuous, to produce the full-length minus strand 
required for genome replication, or interrupted (discontinuous) to produce 
a subgenome-length minus-strand RNA that can subsequently serve as the 
template for the synthesis of one of the sg mRNAs. The polymerase jumping 
that is the basis for leader-to-body joining occurs at specific “transcription- 
regulatory sequences” (TRSs). These conserved sequence motifs comprise
up to a dozen nucleotides and are found in the genome at the 3’
end of the leader sequence and at the 5’ end of each of the sg mRNA bodies.
Higher-order RNA structure and transcription-specific protein factors
quite likely also play a role in the interruption of minus-strand RNA synthesis
at a body TRS, after which the nascent minus strand (with a body TRS 
complement at its 3’ end) is translocated to the 5’-proximal part of the geno- 
mic template. Guided by a base-pairing interaction with the leader TRS, the 
synthesis of the subgenome-length minus-strand RNA is resumed and 
completed with the addition of the complement of the genomic leader 
sequence. In this manner, a nested set of subgenome-length templates for 
sg mRNA synthesis is produced, providing a mechanism to regulate the 
abundance of the different viral proteins by fine-tuning the level at which 
the corresponding sg mRNA is generated (Nedialkova et al., 2010). The 
CoV transcription strategy allows the RTC to use the same 3’-terminal rec- 
ognition/initiation signals in both full- and subgenome-length templates of 
either polarity. Moreover, the presence of the common 5’ leader sequence 
may be important for mRNA capping or other translation-related features. 
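
The discontinuous step described above can be followed in a toy model. The sketch below copies a made-up plus-strand "genome" from its 3’ end, stops at a body TRS, moves the nascent minus strand to the leader TRS, and finishes by copying the leader. Each resulting minus strand is then copied back into a sg mRNA. All sequences, including the six-nucleotide TRS, are hypothetical placeholders. Base-pairing strength, RNA structure, protein factors, and the regulation of sg mRNA abundance are not modeled.

```python
# Toy plus-strand genome assembled from labeled parts (all hypothetical).
LEADER = "AUAUUAGGUU"
TRS = "ACGAAC"            # placeholder transcription-regulatory sequence
REPLICASE = "GGGCCCGGGCCC"
GENE_1 = "AUGUUUGUUU"
GENE_2 = "AUGUCUGAUA"
UTR_3 = "UUUAAA"
GENOME = LEADER + TRS + REPLICASE + TRS + GENE_1 + TRS + GENE_2 + UTR_3

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Complementary strand, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def trs_positions(genome, trs):
    """Start index of every TRS copy; the first one is the leader TRS."""
    positions, start = [], genome.find(trs)
    while start != -1:
        positions.append(start)
        start = genome.find(trs, start + 1)
    return positions


def subgenomic_minus_strand(genome, leader_trs_start, body_trs_start):
    """Minus strand made by copying the 3' end of the genome up to and
    including a body TRS, then jumping to the leader TRS and adding the
    complement of the leader to the 3' end of the nascent strand."""
    body_part = reverse_complement(genome[body_trs_start:])
    leader_part = reverse_complement(genome[:leader_trs_start])
    return body_part + leader_part


def sg_mrnas(genome, trs):
    """Plus-strand sg mRNAs copied from each subgenome-length minus strand."""
    leader_trs, *body_trss = trs_positions(genome, trs)
    return [
        reverse_complement(subgenomic_minus_strand(genome, leader_trs, body))
        for body in body_trss
    ]


if __name__ == "__main__":
    for mrna in sg_mrnas(GENOME, TRS):
        print(mrna)
```

In this model every sg mRNA starts with the same leader and ends with the same 3’ sequence, and each shorter transcript is contained in the longer ones, which is the nested-set arrangement described in the text.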

During the past two decades, studies on the CoV enzyme complex that 
controls this elegant replication and transcription mechanism have been 
accelerated by four important developments. First, using bioinformatics, 
expression systems, and virus-infected cells, the replicase polyprotein 
processing scheme and the proteases involved were elucidated, thus defining 
the boundaries of the 16 mature nsps (Fig. 1) that are working together
during CoV replication (Ziebuhr et al., 2000). Second, using this informa- 
tion and promoted by rapidly advancing methods in structural biology, 
X-ray or NMR structures were obtained for numerous (recombinant) 
full-length CoV nsps or domains thereof, in particular for SARS-CoV 
(Neuman et al., 2014b). Third, multiple techniques for the targeted muta- 
genesis of CoV genomes were developed and refined, which was a specific 
technical challenge due to the exceptionally large size of the CoV RNA 
genome (Almazan et al., 2014). By launching engineered mutant genomes 
in susceptible cells, the RNA and protein players in the CoV replication 
cycle can now be interrogated directly, to reveal their importance, 
function(s) and/or interactions in vivo. Finally, in vitro biochemical assays 
were developed for a variety of CoV replicative enzymes, including many of 
those involved in RNA synthesis and processing. For the purpose of this 
review, we have chosen to focus on these latter functions, as performed 
by the CoV nsp7 to nsp16 products (Gorbalenya et al., 2006; Nga et al., 
2011; Sevajol et al., 2014; Subissi et al., 2014a). These subunits include 
several replicative enzymes that are more or less universal among +RNA
viruses, such as RNA polymerase (nsp12) and helicase (nsp13), but also 
a number of rare or even unique domains involved in, e.g., mRNA 
capping, cap modification, and promoting the fidelity of CoV RNA syn- 
thesis. Several smaller subunits, in particular nsp7 to nsp10, have been iden- 
tified as crucial cofactors of these enzymes and contribute to the emerging 
CoV “nsp interactome,” which will likely need to be advanced considerably 
to achieve a more complete understanding of the intricacies of CoV RNA 
synthesis. Making that step will obviously be key to understanding the 
evolutionary success of CoVs, and nidoviruses at large. Moreover, this 
knowledge will lay the foundation for the development of improved strat- 
egies to combat current and future emerging CoVs, including targeted 
antiviral drug development. 


2. CORONAVIRUS nsp7–10: SMALL BUT CRITICAL
REGULATORY SUBUNITS? 


The 3’-terminal part of ORF1a, the approximately 1.7 kb separating
the nsp6-coding sequence and the ORF1a/1b ribosomal frameshift site, 
encodes a set of four small replicase subunits, named nsp7 to nsp10 
(Fig. 1). Although highly conserved among Coronavirinae, these proteins seem 
to lack enzymatic functions. Instead, they have emerged as (putative)
interaction partners and modulators of ORF1b-encoded core enzymes like nsp12
(RNA-dependent RNA polymerase, RdRp), nsp14 (exoribonuclease,
ExoN), and nsp16 (ribose 2’-O-methyl transferase, 2’-O-MTase). Furthermore,
several of them have been predicted or shown to interact with
RNA. Additionally, a fifth, very small cleavage product is assumed to be 
released from this region of pp1a: the nsp11 peptide resulting from cleavage 
of pp1a at the nsp10/11 junction (Fig. 1). In the pp1ab frameshift product, the
N-terminal sequence of nsp11 (encoded between the nsp10/11 junction and
ORF1a/1b frameshift site) equals the N-terminal part of the nsp12 subunit.
Depending on the CoV species, nsp11 consists of 13-23 residues and its actual
release, function (if any), or fate in CoV-infected cells have not been 
established. In cell culture models, for some (infectious bronchitis virus 
(IBV)) but not other (MHV) CoVs, the nsp10/11 and nsp10/12 cleavages 
were found to be dispensable for virus replication (Deming et al., 2007; 
Fang et al., 2008), even though the conservation of this cleavage site suggests 
that it is generally required for full replicase functionality. 

Processing of the nsp7–nsp10 region of pp1a/pp1ab has been studied in
some detail for MHV (Bost et al., 2000; Deming et al., 2007), human CoV 
229E (HCoV-229E) (Ziebuhr and Siddell, 1999), and IBV (Ng et al., 
2001), confirming the release of these subunits in infected cells and the 
use of the predicted 3CLpro cleavage sites. Processing at these sites was
found to be critical for MHV replication, the exception being inactivation 
of the nsp9/10 cleavage site, which yielded a crippled mutant virus. 
Depending on antibody availability, the subcellular localization of nsp7 
to nsp10 has been studied for several CoVs using immunofluorescence 
microscopy. Without exception, and in line with their role as interaction 
partners of key replicative enzymes, these subunits localize to the
perinuclear region of infected cells (Bost et al., 2000), where the membra- 
nous replication organelles of CoVs accumulate (Gosert et al., 2002; 
Knoops et al., 2008; Maier et al., 2013). It should be noted, however, that 
these labeling techniques cannot distinguish between fully processed nsps 
and polyprotein precursors or processing intermediates. 


2.1 Coronavirus nsp7 


Fig. 2 Crystal structure of the SARS-CoV nsp7–nsp8 hexadecamer (pdb 2AHM) (Zhai
et al., 2005). Purified recombinant SARS-CoV nsp7 and nsp8 were found to self-assemble
into a supercomplex of which the structure was determined at 2.4 Å resolution. (A) The
complex forms a doughnut-shaped hollow structure of which the central channel is lined
with positively charged side chains (in blue) and was postulated to mediate
double-stranded RNA binding. The outside of the structure is predominantly negatively charged
(red surface shading). (B and C) SARS-CoV nsp8 resembles a “golf club”-like shape that
can adopt two conformations, as presented here in orange and green. These nsp8
conformations are integrated into a much larger, hexadecameric structure that is composed
of eight nsp8 subunits and eight nsp7 subunits, of which one is shaded pink. In (B), the
hexadecamer is depicted against the background of the surface plot presented in (A).

The structure of the 83-amino acid SARS-CoV nsp7 was determined using
both NMR (Peti et al., 2005) and X-ray crystallography (Zhai et al., 2005),
with the latter study resolving the structure of a hexadecameric
supercomplex consisting of recombinant nsp7 and nsp8 (see later; Fig. 2). In
both structures, the nsp7-fold includes four helices, but their position and 
spatial orientation is quite different, suggesting that the protein’s conforma- 
tion is strongly affected by the interaction with nsp8, in particular, where it 
concerns helix α4 (Johnson et al., 2010). Reverse-genetics studies targeting
specific residues in SARS-CoV nsp7 confirmed the protein’s importance for 
virus replication (Subissi et al., 2014b), although the impact of single point 
mutations was smaller than anticipated on the basis of the biochemical char- 
acterization of the RNA-binding properties of nsp7-containing protein 
complexes in vitro (see later). 


2.2 Coronavirus nsp8 and nsp7-nsp8 Complexes 


The ~200-amino-acid-long nsp8 subunit initially took center stage due to 
two studies, the first describing a fascinating hexadecameric structure con- 
sisting of eight copies each of nsp7 and nsp8 (Fig. 2) (Zhai et al., 2005), and 
the second reporting an nsp8-specific “secondary” RNA polymerase
activity (Imbert et al., 2006) that was implicated in the mechanism of
initiation of CoV RNA synthesis. This template-dependent activity was
reported to depend on the presence of Mn2+ or Mg2+ and to typically
generate products of up to six nucleotides (for more details, see Section 3.2).
Around the same time, purified recombinant SARS-CoV nsp7 and nsp8 
were found to self-assemble into the hexadecameric supercomplex of which 
the structure was determined at 2.4 Å resolution (Zhai et al., 2005). The
complex was described, and also visualized by electron microscopy, as a 
doughnut-shaped hollow structure of which the central channel is lined 
with positively charged side chains (Fig. 2A). A combination of structural 
modeling, RNA-binding studies, and site-directed mutagenesis led to the 
hypothesis that the complex may slide along the replicating viral RNA 
together with other viral proteins, possibly as a processivity factor for the 
RdRp (nsp12; see later). Within the nsp7–nsp8 hexadecamer,
SARS-CoV nsp8 was found to adopt two different conformations (Fig. 2B and C).
These were named “golf club” and “golf club with a bent shaft” (Zhai 
et al., 2005), with the globular head of the golf club being considered a 
new fold. Although the structures of feline coronavirus (FCoV) nsp7 and 
nsp8 were found to resemble their SARS-CoV equivalents, they were found 
to assemble into a quite different higher-order complex, with two copies of 
nsp7 and a single copy of nsp8 forming a heterotrimer (Xiao et al., 2012). 

Biochemical and reverse-genetics studies pointed toward an important 
role in RNA synthesis for SARS-CoV nsp8 residues K58, P183, and 
R190, whose replacement was lethal to SARS-CoV. Of these residues, 
P183 and R190 were postulated to be involved in interactions with nsp12, 
whereas K58 may be critical for nsp8-RNA interactions (Subissi et al., 
2014b). Reverse-genetics studies targeting the 3’-proximal RNA
replication signals in the MHV genome provided strong evidence for an inter- 
action between nsp8 and these RNA structures (a so-called “bulged stem- 
loop” and RNA pseudoknot). When making a particular 6-nucleotide inser- 
tion in the RNA pseudoknot, which strongly affected MHV replication, mul- 
tiple suppressor mutations evolved, of which several mapped to the genomic 
region encoding nsp8 and nsp9 (Zust et al., 2008). These interactions were 
postulated to be part of a molecular switch that controls minus-strand 
RNA synthesis, or its initiation from the 3’ end of the viral genome 
(te Velthuis et al., 2012; Zust et al., 2008). Using screening approaches 
based on yeast two-hybrid and glutathione S-transferase (GST) pull-down 
assays, SARS-CoV nsp8 was reported to be an interaction partner of many 
other viral proteins (including nsp2, nsp3, and nsp5 to nsp16), although
most of these interactions remain to be verified in the infected cell (von 
Brunn et al., 2007). 


2.3 Coronavirus nsp9 


The CoV nsp9 subunit is about 110 amino acids long and was the second 
replicase cleavage product, after nsp5, for which crystal structures were 
obtained (Egloff et al., 2004; Sutton et al., 2004). The biologically active 
form of the protein is believed to be a dimer that is capable of binding nucleic 
acids in a nonsequence-specific manner, with an apparent preference for 
single-stranded RNA (Egloff et al., 2004; Ponnusamy et al., 2008; Sutton 
et al., 2004). Several nsp9 point mutations that block CoV replication have 
now been described (Chen et al., 2009a; Miknis et al., 2009), but the pro- 
tein’s exact function has remained enigmatic thus far. 

The nsp9 monomer consists of a β-barrel, composed of seven β-strands,
and a C-terminal domain formed by a single α-helix. The latter domain plays
a key role in the formation of the parallel helix–helix dimer conformation
that, based on sequence conservation, structural considerations, and
experimental data (Miknis et al., 2009), is thought to be the biologically most
relevant state of SARS-CoV nsp9. Nevertheless, multiple alternative
structures were described, including a SARS-CoV form that is stabilized by
β-sheet interactions (Sutton et al., 2004) and, for HCoV-229E nsp9, an
antiparallel helix–helix dimer that is stabilized by a disulfide bond (Ponnusamy
et al., 2008). Replacement of the HCoV-229E Cys residue involved in
dimerization (Cys-69) resulted in conversion to the parallel helix–helix
dimer described for SARS-CoV nsp9. Whereas wild-type HCoV-229E
nsp9 is organized as a trimer of dimers, the Cys-69 → Ala mutant and
SARS-CoV nsp9 both form rod-like polymers (Ponnusamy et al., 2008). Disulfide
bonding of the latter protein could not be detected (Miknis et al., 2009). 
Although SARS-CoV and other betacoronaviruses do contain an equivalent 
Cys residue, the feature is not conserved in alphacoronaviruses that are much 
more closely related to HCoV-229E. Thus, it cannot be excluded that the 
disulfide-bonded form of HCoV-229E nsp9 is an artifact of recombinant 
protein purification and crystallization, although it was suggested that oxi- 
dative stress due to viral infection may favor its formation in CoV-infected 
cells (Ponnusamy et al., 2008). We are not aware of experiments directly 
addressing the existence of such a disulfide-linked nsp9 dimer in
CoV-infected cells.

The importance of nsp9 dimerization for SARS-CoV and IBV viability
was demonstrated in reverse-genetics studies (Chen et al., 2009a; Miknis 
et al., 2009) that also independently confirmed the importance of dimerization
of the α-helical domain and in particular a putative GxxxG protein–protein
interaction motif. Although RNA binding in vitro was not disrupted
in dimerization-incompetent SARS-CoV nsp9 variants, their affinity for 
ssRNA 20-mers was reduced by 5- to 12-fold compared to the wild-type 
protein (Miknis et al., 2009). Replacement of some of the basic residues 
(e.g., Lys-10, Lys-51, and Lys-90) in the β-barrel domain of IBV nsp9 also
significantly reduced the protein’s capability to bind RNA in vitro, but these 
mutations only modestly affected virus replication upon reverse engineering 
(Chen et al., 2009a). It remains to be studied how nsp9 dimerization and 
mutagenesis may affect interactions with other replicase subunits, like 
nsp8 and nsp12-RdRp. These proteins were identified as nsp9 interaction 
partners using different technical approaches (Brockway et al., 2003; Sutton 
et al., 2004; von Brunn et al., 2007) and colocalize with nsp9 on the mem- 
branous replication organelles (Bost et al., 2000). At present, the available 
data suggest that, for efficient CoV replication, nsp9 homodimerization is 
a more critical feature than the protein’s affinity for RNA per se. Alterna- 
tively, the correct positioning of RNA on larger protein complexes con- 
sisting of (or containing) nsp9 may be important for the protein’s correct 
functioning in viral RNA synthesis (Miknis et al., 2009). Currently, the fact 
that suppressor mutations arose in MHV nsp9 (and nsp8) after mutagenesis of 
3’-proximal MHV replication signals (see earlier) is the most compelling
evidence for the involvement of nsp9–RNA interactions in a critical step of
CoV replication. The protein may be part of a molecular switch (Zust 
et al., 2008) and/or possess features that are relevant to viral pathogenesis, 
as mutations in nsp9 were found to contribute to increased SARS-CoV 
pathogenesis in an animal model employing young mice infected with a 
mouse-adapted virus strain (MA-15) (Frieman et al., 2012). 


2.4 Coronavirus nsp10 


The small nsp10 subunit (139 residues in the case of SARS-CoV) is among 
the more conserved CoV proteins and is thought to serve as an important 
multifunctional cofactor in replication. Using yeast two-hybrid assays, 
nsp10 was shown to interact with itself, as well as with nsp1, nsp7, nsp14, 
and nsp16. These interactions were confirmed by coimmunoprecipitation 
and/or GST pull-down assays (Brockway et al., 2004; Imbert et al., 
2008; Pan et al., 2008; von Brunn et al., 2007). The important role of nsp10
in replication was first inferred from the phenotype of temperature-sensitive 
mutants of MHV in which an nsp10 mutation was responsible for a defect in 
minus-strand RNA synthesis (Sawicki et al., 2005). In addition, the protein 
was implicated in the regulation of polyprotein processing since an 
engineered MHV nsp10 double mutant (Asp-47 and His-48 to Ala) was partially
impaired in the processing of the nsp4–nsp11 region (Donaldson
et al., 2007). 

When nsp10 was characterized in biochemical and structural studies, the 
protein was found to bind two Zn²⁺ ions with high affinity, suggesting the
presence of two zinc-finger motifs (Matthes et al., 2006). Additionally, in 
in vitro assays, nsp10 displayed a weak affinity for single- and double- 
stranded RNA and DNA, although no obvious sequence specificity could 
be established, suggesting that the protein may function as part of a larger 
RNA-binding complex. Crystal structures of monomeric and dodecameric 
forms of SARS-CoV nsp10 were solved by different laboratories, but 
obvious structural rearrangements between the two forms were not detected 
(Joseph et al., 2007; Su et al., 2006). The structures revealed a new fold in 
which the Zn²⁺ ions are coordinated in a unique conformation and in which
a cluster of basic residues on the protein’s surface probably contributes to the 
RNA-binding properties of nsp10. More recent biochemical studies rev- 
ealed that nsp10 interacts with nsp14 and nsp16 and regulates their respective 
ExoN and ribose-2’-O-MTase (2'-O-MTase) activities (Bouvet et al., 2010, 
2012). Both these cofactor functions will be discussed in more detail later, in 
Section 5. 


3. CORONAVIRUS nsp12: A MULTIDOMAIN RNA 
POLYMERASE 


Although a virus-encoded RdRp is at the hub of the replication of all 
RNA viruses, special properties have long been attributed to the CoV 
RdRp. These ideas find their origin in a combination of CoV features, like 
the exceptionally long RNA genome (Gorbalenya et al., 2006), the complex 
mechanism underlying subgenomic RNA synthesis (Gorbalenya et al., 
2006; Pasternak et al., 2006; Sawicki et al., 2007; Sola et al., 2011), the 
reported high RNA recombination frequency (Graham and Baric, 2010; 
Lai and Cavanagh, 1997), and the size and positioning of the RdRp- 
containing subunit, nsp12, within the replicase polyprotein. It remains to 
be elucidated to which extent features like polymerase processivity, fidelity, 
and template switching (during either genomic recombination or
subgenome-length negative-strand RNA synthesis) are determined by the 
properties of the nsp12-RdRp subunit itself or by some of its protein cofac- 
tors, such as nsp7 and nsp8 (see earlier). In fact, some cofactors have been 
studied more extensively than nsp12 itself, and the same holds true for some 
of the specific RNA signals employed by the RdRp during, e.g., replication 
and subgenomic mRNA synthesis. Protein subunits of the larger RNA-synthesizing
complex, like nsp7–nsp8, the nsp13-helicase, and the nsp14-ExoN,
likely exert a strong influence on RdRp behavior and performance.
On the other hand, a recent study employing homology modeling and 
reverse genetics of the MHV RdRp domain described the first two 
nsp12 mutations that can induce resistance to a mutagen and reduce the 
MHV RdRp error rate during virus passaging (Sexton et al., 2016). Thus,
not unexpectedly, features within nsp12 itself also contribute to properties
like nucleotide selectivity and fidelity regulation. All of the currently
identified nsp12 cofactors, and most other CoV nsps, assemble into 
membrane-associated enzyme complexes (see earlier). The large number 
of viral subunits in these complexes (Subissi et al., 2014a), the likely require- 
ment for host factors (van Hemert et al., 2008), and the concept of RNA 
synthesis occurring in a dedicated microenvironment in the infected cell 
(Knoops et al., 2008; V’Kovski et al., 2015) complicate the straightforward 
characterization of the CoV RdRp. To reconstitute the enzyme’s activities 
in vitro, purified recombinant nsp12 is a key reagent but, for many years, 
such studies were hampered by poor nsp12 expression in Escherichia coli. 
The first in vitro activity assays have only been developed recently 
(Subissi et al., 2014b; te Velthuis et al., 2010), and the same technical issues 
with protein production explain the current lack of an nsp12 crystal struc- 
ture. Consequently, structural information is restricted to sequence com- 
parisons and some homology-based structure models of the C-terminal 
RdRp domain of the ~930-residue-long nsp12 (Xu et al., 2003). More- 
over, most of what we have learned so far is based on the characterization 
of a single nsp12 homolog only, that of the SARS-CoV. 


3.1 The nsp12 RdRp Domain 


The nsp12-coding sequence includes the ORF1a/1b ribosomal frameshift 
site and a programmed -1 frameshifting event directs ORF1b translation 
to yield the pp1ab polyprotein that includes nsp12. The 3CLpro-driven
cleavage required to release the N-terminus of nsp12 is the same that 
separates nsp10 and nsp11. About 925–940 amino acids downstream (932 in
the case of SARS-CoV), the nsp12/nsp13 cleavage site separates the CoV 
RdRp subunit from the helicase-containing cleavage product, which,
uniquely among +RNA viruses, resides downstream of the RdRp domain
for reasons that are poorly understood thus far (Gorbalenya et al., 2006). 
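
As a minimal sketch of what a programmed -1 frameshift does to the reading frame, the Python fragment below reads a made-up RNA string in frame up to a chosen position, then steps back one nucleotide and continues. The toy sequence, the slip position, and the function names are illustrative assumptions and are not taken from any coronavirus genome; the slippery sequence and the RNA structure that stimulate frameshifting are not modeled.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons(rna, start=0):
    """Return consecutive complete triplets of rna, beginning at index start."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]

def read_with_minus1_shift(rna, slip_end):
    """Read rna in frame 0 up to slip_end, then continue in the -1 frame.

    slip_end is a hypothetical 0-based index (exclusive, multiple of 3) at
    which in-frame decoding stops; decoding resumes one nucleotide upstream.
    """
    if slip_end % 3 != 0 or not 3 <= slip_end <= len(rna):
        raise ValueError("slip_end must be a multiple of 3 within the sequence")
    upstream = codons(rna[:slip_end])        # codons decoded before the slip
    downstream = codons(rna, slip_end - 1)   # re-read one nucleotide back
    return upstream + downstream

# Toy sequence (hypothetical): the zero frame runs into a stop codon,
# whereas the -1 frame entered after the third codon stays open.
TOY_RNA = "AUGGCCAAAUGACCGGGUUAA"

in_frame = codons(TOY_RNA)
shifted = read_with_minus1_shift(TOY_RNA, slip_end=9)

first_stop = next(i for i, c in enumerate(in_frame) if c in STOP_CODONS)
has_stop_after_shift = any(c in STOP_CODONS for c in shifted)
```

In this toy case the unshifted reading terminates early, while the shifted reading continues through the downstream sequence, which mirrors in a simplified way how the frameshift extends translation from ORF1a into ORF1b.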

Nsp12 consists of at least two domains, the recently described N-termi- 
nal “nidovirus-wide conserved domain with nucleotidyl transferase activity” 
(nidovirus RdRp-associated nucleotidyltransferase (NiRAN); see later) 
(Lehmann et al., 2015a) and the C-terminal canonical RdRp domain 
(Gorbalenya et al., 1989). The latter possesses the common motifs and struc- 
tural features found in other RNA polymerases, which are often summarized 
as a “cupped right hand” with subdomains called fingers, palm, and thumb 
each playing specific roles in binding of templates and NTPs, initiation, and 
elongation (te Velthuis, 2014; Xu et al., 2003). In simplified form, the reac- 
tion catalyzed by the RdRp comes down to selecting the appropriate NTP 
to match with the template and the formation of a phosphodiester bond to 
extend the 3’ end of the nascent RNA chain with this incoming nucleotide 
(Ng et al., 2008; van Dijk et al., 2004). Reconstituting these activities
in vitro using a purified RdRp preparation can be relatively straightforward, 
but sometimes is a huge technical challenge depending—among other 
factors—on the efficiency of recombinant RdRp expression and purifica- 
tion, the existence of specific template requirements (e.g., recognition sig- 
nals), and the need for protein cofactors. 


3.2 The Initiation Mechanism of the nsp12 RdRp 


The initiation mechanism of the CoV RdRp, primer dependent or de novo, 
continues to be a much-debated issue, with important implications for the 
question of how CoVs maintain the integrity of the crucial terminal 
sequences of their genome. Compared to a de novo-initiating RdRp, the 
enzyme’s active site, which is enclosed by the thumb and fingers domains, 
needs to be more accessible when a primer-template duplex has to be 
accommodated. De novo initiation, on the other hand, requires specific 
structural elements (so-called “priming loops”) that serve to properly posi- 
tion the initiating NTPs for catalysis, thus creating an initiation platform for 
RNA synthesis. Bioinformatics analyses grouped the CoV RdRp with 
primer-dependent RdRps, as found in, e.g., picornaviruses and caliciviruses, 
in part based on the identification of a specific sequence motif (motif G) that 
is thought to mediate primer recognition (Fig. 3A) (Beerens et al., 2007; 
Fig. 3 Comparison of coronavirus nsp12 and arterivirus nsp9, containing the highly
conserved NiRAN and RdRp domains. (A) Similarity density plot derived from a multiple 
sequence alignment including RdRp subunits from all nidovirus lineages. To highlight 
local deviations from the average, areas displaying conservation above and below the 
mean similarity are shaded in black and gray, respectively. Conserved sequence motifs 
of NiRAN (subscript N; see also B) and RdRp (subscript R) are labeled. Domain boundaries 
used for bioinformatics analyses and uncertainty with respect to the NiRAN/RdRp 
domain boundary are indicated with vertical and by dashed horizontal lines, respectively. 
Below each plot, the predicted secondary structure elements are presented in gray for 
α-helices and black for β-strands. (B) Multiple sequence alignment showing the three
conserved motifs of the NiRAN domain from representative species across the 
Nidovirales order. Conserved residues in this alignment are shown in white font, while 
partially conserved residues are boxed. The bottom line depicts residues also conserved 
in the arterivirus EAV, which was used for a first experimental analysis of the NiRAN 
domain (Lehmann et al., 2015a). Abbreviations not explained in the main text: NHCoV, 
night-heron coronavirus HKU19 (genus Deltacoronavirus); BToV, bovine torovirus 
(family Coronaviridae, subfamily Torovirinae, genus Torovirus); WBV, white bream virus 
(family Coronaviridae, subfamily Torovirinae, genus Bafinivirus); YHV, yellow head 
virus (family Roniviridae, genus Okavirus); CavV, Cavally virus (family Mesoniviridae, genus
Alphamesonivirus). (A) Modified with permission from Lehmann, K.C., Gulyaeva, A.,
Zevenhoven-Dobbe, J.C., Janssen, G.M., Ruben, M., Overkleeft, H.S., et al., 2015. Discovery
of an essential nucleotidylating activity associated with a newly delineated conserved 
domain in the RNA polymerase-containing protein of all nidoviruses. Nucleic Acids Res., 
43, 8416-8434. 
Gorbalenya et al., 2002; Xu et al., 2003). This prediction appeared to be
further supported by the identification of SARS-CoV nsp8 as a de novo- 
initiating second RNA polymerase (see earlier), capable of synthesizing 
products of up to six nucleotides in length that could serve to prime 
RNA synthesis by the nsp12-RdRp (Imbert et al., 2006). Support for a 
direct interaction between nsp8 and nsp12 was obtained using different 
technical approaches (Imbert et al., 2008; Subissi et al., 2014b; von 
Brunn et al., 2007). However, although a similar primer-independent 
RdRp activity was reported for the FCoV nsp8 (Xiao et al., 2012), other 
studies have called into question this concept of a primase–main RdRp
(i.e., nsp8/nsp12) tandem working in concert to achieve initiation of 
processive CoV RNA synthesis (see later). 

Using recombinant SARS-CoV nsp12, preliminary evidence for 
primer-dependent RdRp activity on poly(A) templates was first obtained 
using a GST–nsp12 fusion protein, although these efforts were hampered
by protein instability, which also led to the conclusion that the N-terminal 
domain of nsp12 is required for activity (Cheng et al., 2005). Subsequently, a 
C-terminally His6-tagged SARS-CoV nsp12 was found to mediate homopolymeric
RNA synthesis in a primer-dependent manner (te Velthuis et al.,
2010). Both these activities must probably be considered relatively weak and 
nonprocessive compared to the activity observed when a SARS-CoV nsp12 
RdRp assay was supplemented with nsp7 and nsp8 (Subissi et al., 2014b). 
However, at the same time, this study reinvigorated the debate on the initiation
mechanism of the coronavirus RdRp, as the nsp7-8-12 tripartite
complex displayed both primer-dependent and de novo initiation of 
RNA synthesis, whereas no de novo-initiating RdRp activity could be 
detected for nsp8 or the nsp7–nsp8 complex alone (Subissi et al., 2014b).
To add to the confusion, other studies reported de novo initiation by SARS- 
CoV nsp12 alone (Ahn et al., 2012) and primer-dependent RdRp activity of 
SARS-CoV nsp8, when expressed without affinity tags commonly used to 
facilitate purification (te Velthuis et al., 2012). Technical differences 
between these studies and those summarized earlier (e.g., regarding expres- 
sion constructs and templates used) may have contributed to the contradic- 
tory results obtained on the RdRp activities of nsp8 and nsp12. Thus far, five 
different laboratories addressed the two (putative) coronavirus RdRps in 
seven independent studies, none of which succeeded in exactly reproducing 
the results of any of the other studies (Ahn et al., 2012; Cheng et al., 2005; 
Imbert et al., 2006; Subissi et al., 2014b; te Velthuis et al., 2010, 2012; Xiao 
et al., 2012). Nidovirus RdRps appear to be technically challenging and 
sensitive proteins that may respond to minute changes in purification protocols
or assay conditions. Clearly, both the role of nsp8 (primase or
processivity factor?) and the initiation mechanism employed by the 
nsp12-RdRp require further study. Although the bioinformatics-based pre- 
diction that nsp12 uses a primer-dependent initiation mechanism is compel- 
ling, it lacks the direct support of an nsp12 crystal structure. At the same 
time, the question of the nature and source of the primer that would be used 
by nsp12 seems to be wide open again. 


3.3 Inhibitors of the nsp12 RdRp 


As for other RNA viruses, the nsp12-RdRp of CoVs is a primary drug target 
that may, in principle, be inhibited without major toxic side effects for the 
host cell. Nucleoside analogs constitute an important class of antiviral drug 
candidates that can target viral RdRps, but efforts to use them to inhibit 
CoV replication have not been very successful thus far (Chu et al., 2006;
Ikejiri et al., 2007). Moreover, it remains to be established that their target 
in the infected cell is indeed the nsp12-RdRp. The mismatch repair capa- 
bilities attributed to the nsp14-ExoN domain (see later) (Bouvet et al., 2012) 
may pose an additional hurdle, as the efficacy of a nucleoside analog with
anticoronavirus activity may be determined by the balance between its pro- 
pensity to be incorporated by the nsp12-RdRp and its tendency to resist 
excision by the mismatch repair mechanism mediated by nsp14-ExoN. 
Similar considerations apply to ribavirin, a guanosine analog with broad- 
spectrum antiviral activity that is used to treat patients infected with a variety 
of RNA viruses. Its mechanism of action appears to differ on a case-by-case 
basis, but may include the induction of lethal mutagenesis by increasing the 
RdRp error rate, inhibition of viral mRNA capping, and reduction of viral 
RNA synthesis by inhibition of the cellular enzyme inosine monophosphate 
dehydrogenase (IMPDH), which decreases the availability of intracellular 
GTP (Crotty et al., 2000, 2002; Smith et al., 2013, 2014). Although ribavirin 
was used to treat small numbers of SARS and MERS patients, high doses 
were used and the benefits of the treatment remained essentially unclear 
(Zumla et al., 2016). Experiments with different CoVs in animal models 
(Barnard et al., 2006; Falzarano et al., 2013) and infected cell cultures 
(Ikejiri et al., 2007; Pyrc et al., 2006) also established its poor activity and 
strongly suggested that ribavirin does not target the CoV RdRp directly 
or is targeted (itself) by the nsp14-ExoN activity (Smith et al., 2013). Inno- 
vative nucleoside inhibitors continue to be identified or developed (Peters 
et al., 2015; Warren et al., 2014) and the recently described in vitro RdRp
assay (Subissi et al., 2014b) may prove very useful for establishing their 
mechanism of inhibition more precisely. A better understanding of 
nsp12-RdRp structure and function will also be required to design strategies 
that minimize the impact of drug resistance-inducing mutations, which are a 
common problem when targeting enzymes of rapidly evolving RNA 
viruses. 


3.4 The nsp12 NiRAN Domain 


Since the delineation of the borders of the CoV RdRp-containing replicase 
cleavage product (Boursnell et al., 1987; Gorbalenya et al., 1989), which is 
now known as nsp12, it had been clear that the protein must be a multi- 
domain subunit, with the canonical RdRp domain roughly occupying its 
C-terminal half (Fig. 3A). Only recently, first clues to some of the properties 
and possible functions of the N-terminal part of nsp12 were obtained 
(Lehmann et al., 2015a). A renewed bioinformatics analysis across the (still 
expanding) order Nidovirales revealed that the nidoviral RdRp-containing 
replicase subunit contains a conserved N-terminal domain of 200-300 res- 
idues (~225 residues in CoV nsp12; Fig. 3B). In CoV nsp12, about 175 
residues separate the NiRAN and RdRp domains, leaving space for the
presence of an additional domain between the two. 
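
The domain sizes quoted above can be combined in a short back-of-the-envelope calculation. The sketch below uses only the approximate figures given in the text (932 residues for SARS-CoV nsp12, about 225 for NiRAN, about 175 for the intervening region) and assumes, purely for illustration, that NiRAN begins at the N-terminus and that the RdRp domain extends to the C-terminus; the constant and function names are hypothetical.

```python
NSP12_LENGTH = 932    # SARS-CoV nsp12 length in residues (from the text)
NIRAN_LENGTH = 225    # approximate NiRAN size in CoV nsp12 (from the text)
SPACER_LENGTH = 175   # approximate NiRAN-to-RdRp spacing (from the text)

def c_terminal_span(total, niran, spacer):
    """Residues left for the C-terminal domain under the stated assumptions."""
    return total - niran - spacer

rdrp_span = c_terminal_span(NSP12_LENGTH, NIRAN_LENGTH, SPACER_LENGTH)
rdrp_fraction = rdrp_span / NSP12_LENGTH

print(f"Estimated RdRp span: ~{rdrp_span} residues "
      f"({rdrp_fraction:.0%} of nsp12)")
```

Under these assumptions a little over half of nsp12 remains for the RdRp domain, consistent with the description of the canonical RdRp as roughly occupying the C-terminal half of the protein.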

Based mainly on biochemical data obtained with the arterivirus homolog 
(see later), the N-terminal domain was concluded to possess an essential 
nucleotidylation activity and hence it was coined nidovirus RdRp-associated 
nucleotidyltransferase (NiRAN) (Lehmann et al., 2015a). NiRAN conservation
was found to be lower than that of the downstream RdRp domain
(Fig. 3A), but the analysis suggested that the evolutionary constraints on 
NiRAN have been similar in different nidovirus lineages, which would be 
in line with a conserved function. Gorbalenya and colleagues identified three 
key NiRAN motifs (A-B-C) containing seven invariant residues (Fig. 3B), 
with motifs B and C being most conserved (Lehmann et al., 2015a). The
identification of the NiRAN domain was further supported by the conserva- 
tion of its predicted secondary structure elements in different nidovirus fam- 
ilies (Fig. 3A). Extensive database searches did not reveal potential NiRAN 
homologs in either the viral or the cellular world, although it cannot be 
excluded that the domain has diverged from cellular ancestors to a level that 
prevents their identification with the currently available sequences and tools. 
Nevertheless, its unique presence in nidoviruses and its association with the 
important RdRp domain suggest that NiRAN may be a crucial regulator or
interaction partner of the downstream RdRp domain that must have been
acquired before the currently known nidovirus lineages diverged. NiRAN
and the zinc-binding domain (ZBD) that is associated with the nsp13-helicase 
protein (see later) are the only unique genetic markers of the order Nidovirales 
identified thus far. 

Mainly due to the lack of sufficient amounts of recombinant CoV nsp12, 
the preliminary biochemical characterization of NiRAN was restricted to its 
arterivirus homolog, using recombinant nsp9 of equine arteritis virus (EAV) 
(Lehmann et al., 2015a). For both EAV and SARS-CoV, it could be shown 
that replacement of conserved NiRAN residues can cripple or completely 
block virus replication in cultured cells. A combination of biochemical assays 
revealed that in vitro the NiRAN domain exhibits a specific, Mn²⁺-dependent
enzymatic activity that results in the self-nucleotidylation of EAV nsp9. The 
activity was abolished upon mutagenesis of conserved key residues in 
NiRAN motifs A, B, and C. Although UTP was found to be the preferred 
substrate for NiRAN’s in vitro nucleotidylation activity, GTP could also
be used, albeit less efficiently. The conserved lysine residue in motif
A (the EAV equivalent of Lys-73 in SARS-CoV nsp12) was concluded 
to be the most likely target residue for nucleotidylation via formation of a 
phosphoamide bond. 

Although the importance of the NiRAN domains of arterivirus nsp9 and 
coronavirus nsp12 was supported by the outcome of reverse-genetics studies
(Lehmann et al., 2015a), the role of the produced protein–nucleoside
adducts in viral replication remains unclear at present. In fact, the unique 
dual specificity for UTP and GTP seems to argue against two initially con- 
sidered potential NiRAN functions (Lehmann et al., 2015a). The first of 
these was a role as an RNA ligase, a type of activity that, however, is commonly
ATP dependent. The second was its involvement in synthesizing mRNA
cap structures. One of the four enzymes required for this process, the crucial 
guanylyl transferase (GTase), still remains to be identified for CoVs (see 
later). However, NiRAN’s substrate preference for UTP over GTP is 
difficult to reconcile with this hypothesis and has not been observed for 
other GTases involved in mRNA capping. The third hypothesis that was 
put forward links back to the open question of the initiation of coronavirus 
RNA synthesis, which presumably is a primer-dependent step (see earlier). 
Nsp12 nucleotidylation could be envisioned to play a role in protein-primed 
RNA synthesis, a strategy used by, e.g., picornaviruses and their relatives, 
which covalently attach an oligonucleotide to a viral protein (called VPg 
in the case of picornaviruses) that subsequently mediates the initiation of
RNA synthesis (Paul et al., 2000). The first step in the synthesis of the 
“protein primer” is a nucleotidylation step during which a nucleotide 
monophosphate is covalently attached to the VPg. NiRAN could be 
involved in a similar mechanism either directly or indirectly, by transferring 
the bound nucleotide to another protein player. Although such a mechanism 
would definitely revolutionize the concept of the initiation of CoV RNA 
synthesis, it is clearly not very compatible with some of the currently avail- 
able data, such as the reported presence of a 5’ cap structure (rather than a 
VPg-like molecule) on CoV mRNAs. Evidently, the further in-depth characterization
of NiRAN is needed to fill the current knowledge gaps, starting
with the biochemical characterization of a CoV NiRAN domain, which
may confirm and extend the features now deduced from the analysis of 
its distantly related arterivirus homolog. 


4. CORONAVIRUS nsp13: A MULTIFUNCTIONAL AND 
HIGHLY CONSERVED HELICASE SUBUNIT 


Helicases are versatile NTP-dependent motor proteins that play a role
in cellular nucleic acid metabolism in the broadest possible sense, including 
processes like DNA replication, recombination and repair, transcription, 
translation, as well as RNA processing. Helicases are also encoded by all 
+RNA viruses with a genome size exceeding 7 kb, suggesting they are 
required for the efficient replication of +RNA viral genomes above this size
threshold. Given the large size of the genomes of CoVs and related 
nidoviruses, they may depend on the function(s) of a replicative helicase 
even more than other +RNA virus taxa. However, despite their abundance 
and conservation, the specific role of helicases in +RNA virus replication 
remains poorly understood. For an extensive recent review of nidovirus 
helicases, the reader is referred to Lehmann et al. (2015c). 

Currently, helicases are classified into six superfamilies (SFs) (Singleton 
et al., 2007), with +RNA viral helicases belonging to SF1 (e.g., alphaviruses
and nidoviruses), SF2 (e.g., flaviviruses), or SF3 (e.g., picornaviruses). The
presence of a SF1 helicase (HEL1) domain in the CoV replicase polyprotein
was discovered upon the early in-depth analysis of the first full-length CoV 
genome sequence that became available (IBV) (Gorbalenya et al., 1989). 
The HEL1 domain maps to the C-terminal part of the replicase cleavage 
product that is now known as nsp13, which is about 600 residues long. 
The CoV HEL1 domain contains all characteristic sequence motifs of the 
SF1 superfamily. The N-terminal part of nsp13 is formed by a multinuclear
ZBD, one of the most conserved domains across the order Nidovirales 
(Gorbalenya, 2001; Nga et al., 2011). This qualification also applies to 
the helicase-containing subunit as a whole, despite considerable size differ- 
ences between, e.g., CoV nsp13 and its arterivirus homolog (designated 
nsp10) (Lehmann et al., 2015c). The ZBD and HEL1 domains occupy a 
conserved position downstream of the RdRp domain in all nidovirus rep- 
licase polyproteins studied so far. 


4.1 The Coronavirus nsp13 SF1 Helicase (HEL1) 


SF1 helicases contain at least a dozen conserved motifs that direct the bind- 
ing of NTPs and nucleic acids. Of these, motifs I and II (also known as the 
Walker A and B boxes) are common to helicases of all SFs as well as 
NTPases. Structurally, the catalytic core of SF1 helicases like the CoV 
HEL1 domain is formed by two RecA-like domains, designated 1A and 
2A (Fig. 4), that bind to nucleic acids through stacking interactions of aro- 
matic residues with the bases of their nucleic acid substrates (Velankar et al., 
1999). Cyclic conformational changes of the RecA-like domains mediate 
the conversion of the energy from hydrolysis of the phosphoanhydride bonds
of NTPs into directional movement along the nucleic acid substrate, with 
the so-called “inchworm” model now widely being considered as best 
supported by the available experimental data (Lehmann et al., 2015c; 
Velankar et al., 1999; Yarranton and Gefter, 1979). Additional domains, 
located up- or downstream of 1A and 2A, or inserted internally, can mediate 
supplemental protein–protein and protein–nucleic acid interactions or
enzymatic activities, thus contributing to the functional versatility and spec- 
ificity of the enzyme (Lehmann et al., 2015c; Singleton et al., 2007). 
Within helicase SF1, the CoV HEL1 domain belongs to the Upf1-like
family (SF1B), whose members move in the 5’-to-3’ direction
along the nucleic acid strand to which they bind. Upf1-like helicases may
unwind either DNA or RNA and, in some cases, also both substrates with- 
out a clear preference, as was readily observed during the in vitro character- 
ization of different nidovirus helicases. The CoV HEL1 activity was first 
demonstrated in vitro using recombinant HCoV-229E nsp13 (Seybert 
et al., 2000a). Bacterially expressed nsp13 from HCoV-229E and SARS- 
CoV, and also the homologous helicase (nsp10) of the arterivirus EAV, 
displayed 5’-to-3’ unwinding activity on double-stranded RNA or DNA 
substrates containing single-stranded 5’ overhangs (Ivanov and Ziebuhr, 
[Fig. 4 panel labels: hUpf1 (2WJV); EAV nsp10 (4N0N); predicted SARS-CoV nsp13]


Fig. 4 Three-dimensional models of cellular hUpf1 (the prototype of the Upf1-like fam- 
ily of SF1 helicases), the EAV nsp10 helicase (Deng et al., 2014), and the predicted struc- 
ture of SARS-CoV nsp13. Based on sequence and structural comparisons, nidovirus 
helicases are classified into the Upf1-like family. Domain colors in the structures corre- 
spond to those used in the domain organization depicted above each structure, in 
which domain sizes are not drawn to scale. Dashed domains represent parts that could 
not be modeled. Zn²⁺ ions bound to the respective N-terminal domains are depicted as
pink spheres. The identical coloring of domains other than 1A and 2A does not imply an 
evolutionary relationship. PDB accession numbers are listed in brackets. Modified with 
permission from Lehmann, K.C., Snijder, E.J., Posthuma, C.C., Gorbalenya, A.E., 2015. What
we know but do not understand about nidovirus helicases. Virus Res., 202, 12–32.


2004; Ivanov et al., 2004b; Seybert et al., 2000a,b; Tanner et al., 2003). Fol- 
lowing the biochemical characterization of SARS-CoV nsp13, it was calcu- 
lated that unwinding occurs in discrete steps of 9.3 base pairs each, with a 
catalytic rate of 30 steps per second (Adedeji et al., 2012a). The nsp13 
NTPase activity can use all four natural ribonucleotides and deoxyribonucleotides
as substrate, with ATP, dATP, and GTP being hydrolyzed most efficiently, 
and UTP being the least preferred substrate (Ivanov and Ziebuhr, 2004; 
Ivanov et al., 2004b; Tanner et al., 2003). Replacement of a conserved 
Lys in motif I, the Walker A box (Walker et al., 1982), kills the in vitro 
NTPase activity of all nidovirus helicases tested thus far and, when intro- 
duced by reverse genetics, this mutation also abolished replication of the 
arterivirus EAV (Seybert et al., 2000b). 
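
The step size and stepping rate reported above for SARS-CoV nsp13 can be turned into an overall unwinding rate with simple arithmetic. The sketch below does this and, as a purely illustrative extension, estimates how long a single enzyme would need to traverse a duplex of 30,000 base pairs (a round figure within the CoV genome size range). It assumes uninterrupted, perfectly processive unwinding, which the in vitro data do not necessarily support; the function names and the 30,000-bp value are assumptions made for this example.

```python
STEP_SIZE_BP = 9.3       # base pairs unwound per step (Adedeji et al., 2012a)
STEPS_PER_SECOND = 30    # catalytic rate in steps per second (same study)

def unwinding_rate_bp_per_s(step_size_bp, steps_per_second):
    """Overall unwinding rate as step size multiplied by stepping rate."""
    return step_size_bp * steps_per_second

def seconds_to_unwind(duplex_length_bp, rate_bp_per_s):
    """Idealized traversal time, ignoring pausing and dissociation."""
    return duplex_length_bp / rate_bp_per_s

rate = unwinding_rate_bp_per_s(STEP_SIZE_BP, STEPS_PER_SECOND)

# Hypothetical duplex length chosen only to illustrate the scale.
ILLUSTRATIVE_LENGTH_BP = 30_000
duration = seconds_to_unwind(ILLUSTRATIVE_LENGTH_BP, rate)

print(f"Unwinding rate: about {rate:.0f} bp/s")
print(f"Idealized time for {ILLUSTRATIVE_LENGTH_BP} bp: about {duration:.0f} s")
```

The resulting figures are upper-bound estimates for a single, never-dissociating enzyme and should not be read as measured values for the infected cell.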

The substrate preferences summarized earlier support a three-dimensional
model of the SARS-CoV HEL1 core domains (1A and 2A) that
was based on structural information available for multiple cellular helicases 
(Hoffmann et al., 2006). The model predicts both the existence of multiple 
hydrogen bonding interactions with the β- and γ-phosphates of the NTP
and a lack of specific interactions with the nucleobase. Thus, the mere 
presence of a 5’ triphosphate group appears to be the main determinant for
NTP/dNTP binding. Since nidovirus helicases are presumed to unwind 
double-stranded RNA intermediates that are formed during viral replica- 
tion, considerable attention was given to the in vitro characterization of their 
nucleic acid substrate preferences (Seybert et al., 2000a,b). The HCoV- 
229E and EAV helicases could not unwind substrates with 3’ single-stranded 
tails or blunt-ended substrates. In contrast, RNA and DNA substrates with 
one or two 5’ single-stranded regions were unwound efficiently, suggesting 
that the nidovirus helicase must bind to a single-stranded region before initiating
unwinding in the 5’-to-3’ direction. However, the in vitro assays did
not yield any clear indications for the preferred recognition of specific 
sequences or higher-order structures in the substrate (Lehmann et al., 
2015c). A more in-depth biochemical characterization, performed with
SARS-CoV nsp13, also confirmed that the CoV helicase does not discriminate
between RNA and DNA substrates (Adedeji et al., 2012a). Consequently, it 
cannot be excluded that the enzyme, in addition to being engaged in viral 
RNA synthesis, may also target host DNA. Nuclear translocation of 
nidovirus helicases has not been reported thus far, but the light microscopy 
techniques used to study the protein’s subcellular distribution would not suf- 
fice to detect the nuclear import of only a small fraction of the protein. 

As a final caveat it should be stressed that the biochemical properties sum- 
marized earlier are all derived from in vitro studies using recombinant heli- 
cases, expressed in different systems and sometimes containing substantial 
foreign sequences. The in situ characterization of the helicase as one of 
the key enzymes of the nidovirus RNA-synthesizing machinery remains 
to be addressed. In that context, sequence specificity, for example, could 
be conveyed by other subunits of the replicase complex, which may target 
the helicase to, e.g., the initiation sites for viral genome or antigenome syn- 
thesis, or to signals controlling the production of subgenomic mRNAs. As 
summarized by Lehmann et al. (2015c), other important helicase features 
that could be dramatically different in the setting of the infected cell are 
(the need for) helicase oligomerization, cooperativity between multiple 
helicase molecules binding to the same substrate, and—consequently— 
the overall processivity of the enzyme, which in vitro appeared to be quite 
low given the large CoV genome size (Adedeji et al., 2012a). 


4.2 The Helicase-Associated ZBD 


The nidovirus helicase subunit domain is unique among its +RNA virus
homologs in having a conserved N-terminal domain of 80-100 residues that 
contains 12 or 13 conserved Cys/His residues (den Boon et al., 1991; van
Dinten et al., 2000). The domain was recognized as a potential ZBD 
(Gorbalenya et al., 1989) and early in vitro studies with the recombinant 
HCoV-229E and EAV helicases confirmed that Zn²⁺ ions are essential
for retaining the protein’s enzymatic activities, suggesting that ZBD mod- 
ulates nidovirus helicase function (Seybert et al., 2005). A recent structural 
study of the arterivirus nsp10-helicase (Deng et al., 2014) will be discussed in 
more detail later. This first nidovirus helicase structure confirmed the bind- 
ing of three zinc ions by the ZBD, which adopts a unique fold that combines 
a RING-like module with a so-called “treble-clef” zinc finger.

ZBD and HEL1 interact extensively (Deng et al., 2014) but, in the helicase
primary structure, they are separated by a variable and uncharacterized domain 
that essentially explains the size difference of about 130 residues between CoV 
and arterivirus helicase subunits (Seybert et al., 2005). Using the arterivirus 
prototype EAV, the functional importance of ZBD was probed extensively 
by combining biochemistry and reverse genetics (Seybert et al., 2005; van 
Dinten et al., 2000). This yielded a variety of phenotypes for nsp10-ZBD 
mutants, the most striking being mutants deficient in subgenomic mRNA 
synthesis while remaining capable of (and even enhancing) viral genome 
replication (van Dinten et al., 1997, 2000) (see later). Most replacements of 
conserved ZBD Cys and His residues profoundly impacted the helicase 
activity of EAV nsp10, even when performed in a semiconservative manner 
that could preserve zinc binding. In reverse-genetics studies, most of these 
ZBD mutations rendered the virus nonviable. Recently, the impact of 
these mutations on ZBD integrity and ZBD-HEL1 interactions could be 
rationalized with the help of the nsp10 crystal structure (Deng et al., 2014). 


4.3 Nidovirus Helicase Structural Biology 


Despite its importance as a potential drug target, a CoV nsp13 or HEL1 crys- 
tal structure has not been obtained thus far due to technical complications 
with recombinant protein production and crystallization. Instead, several 
CoV helicase models have been described, mainly based on cellular helicase 
structures (Bernini et al., 2006; Hoffmann et al., 2006; Lehmann et al., 
2015c). Given this limitation, and despite the large evolutionary distance 
between the two enzymes, it is interesting to have a closer look at the 
recently published EAV nsp10-helicase structure (Deng et al., 2014). 

The overall structure of EAV nsp10 (Fig. 4) consists of the N-terminal 
ZBD, a new domain designated 1B, the two recA-like HEL1 domains 
(1A and 2A), and a short C-terminal domain, which is not conserved among
nidoviruses and needed to be deleted to allow nsp10 crystallization. This 
65-amino-acid C-terminal truncation did not affect the helicase core 
domains and only modestly influenced levels of ATPase and helicase activity 
compared to the full-length protein (Deng et al., 2014). Compared to cel- 
lular representatives of the SF1 helicase superfamily, nsp10 is most similar to 
Upf1 and its close homolog Ighmbp2, which also contain an N-terminal 
ZBD. Moreover, the location and orientation of the newly discovered 
1B domain of nsp10 (residues 83-137) resembles that of the domain with 
the same name found in the Upf1-like helicase subfamily.

The nsp10 ZBD uses 12 conserved Cys and His residues to coordinate 
three zinc ions and folds into two zinc-binding modules that are connected 
by a disordered region. The N-terminal RING-like structure of nsp10 coor- 
dinates two zinc ions and the closest similarity that was found for this module 
was with the N-terminal zinc-binding CH-domain of Upf1. Both proteins 
have a second zinc-binding module downstream (a so-called treble-clef zinc 
finger in the case of nsp10), but these are structurally different, suggesting 
that the nidoviral ZBD represents a new kind of complex zinc-binding ele- 
ment. The previous suggestion of ZBD codetermining HEL1 function was 
strengthened by the presence of an extensive interface of 1019 Å² that was
proposed to be involved in intramolecular signaling (Deng et al., 2014). 
A second crystal structure was obtained for nsp10 in complex with a partially 
double-stranded DNA substrate, revealing possible nucleic acid-binding 
clefts at the protein’s surface that are formed by the ZBD+1B and ZBD 
+1A domains. Although the exact path of the nucleic acid strands could 
not yet be determined, it became clear that the positively charged ZBD, 
and in particular its N-terminal RING-like module, must be involved in 
nucleic acid binding. In line with the biochemical data summarized earlier, 
most of the nsp10-substrate contacts identified are not base-specific and 
occur with the nucleic acid backbone. Whereas the HEL1 core domains 
were found to be quite similar in the absence or presence of bound substrate, 
a remarkable 29 degree rotation was observed for domain 1B, enlarging the 
dimensions of the nucleic acid substrate channel formed by domains 1A and 
1B, but not allowing it to accommodate a duplex substrate. Consequently, 
it was postulated that an element near the entrance of the substrate 
channel may destabilize the duplex and facilitate the entrance of one of 
the strands into the channel. Since the double-stranded region of the 
duplex could not be modeled, additional studies are needed to verify the 
existence and molecular details of this proposed unwinding mechanism 
(Deng et al., 2014; Lehmann et al., 2015c). Likewise, direct structural
information on CoV nsp13 is needed to be able to assess to what extent
the structural observations made for arterivirus nsp10 can be translated to 
distantly related (and larger) nidovirus helicases (Fig. 4). In general, how- 
ever, the analysis of nsp10 provided a clear basis for a model in which the 
function of the common RecA-like core domains of nidovirus helicases 
is modulated by specific extension domains, presumably to facilitate the 
involvement of the nidovirus helicase in multiple processes in the infected 
cell (see later). 


4.4 Functional Characterization of the Nidovirus Helicase 


As outlined earlier, the biochemical characterization of purified recombi- 
nant nidovirus helicases, the functional probing of (in particular) the EAV 
nsp10-helicase using reverse genetics, the EAV nsp10 structure, and 
advanced bioinformatics analyses together have painted a picture of an 
enzyme that is involved in multiple critical steps of the viral replicative cycle. 
Space limitations do not allow an in-depth discussion of all of these (puta- 
tive) functions, which—based on EAV nsp10 studies—may include a poorly 
characterized role in virion biogenesis (van Dinten et al., 2000), not unlike 
what was uncovered for, e.g., the helicase-containing NS3 protein of 
flaviviruses (Liu et al., 2002; Ma et al., 2008). Likewise, we will not discuss 
the first reports on possible interactions with host proteins, such as the Ddx5 
helicase (for SARS-CoV nsp13; Chen et al., 2009b) and DNA polymerase δ (for
IBV nsp13; Xu et al., 2011). Instead, we will focus on the most significant 
findings related to nidovirus helicase functions in RNA synthesis and 
processing, specifically (i) genome replication, (ii) transcription of sub- 
genomic mRNAs, (iii) mRNA capping, and (iv) posttranscriptional mRNA 
quality control. 

The presumed “default” function of +RNA viral helicases is to cooper- 
ate with the viral RdRp to achieve the efficient amplification of the genome. 
In this context, helicases are presumed to promote RdRp processivity by 
opening up the double-stranded RNA intermediates of viral replication, 
and possibly also by removing RNA secondary structures in single-stranded 
template strands (Kadare and Haenni, 1997). In this light, reports on molec- 
ular interactions between the CoV RdRp (nsp12) and helicase (nsp13) were 
not unexpected (Imbert et al., 2008; von Brunn et al., 2007). The same holds 
true for the observation that SARS-CoV nsp12 can stimulate the in vitro 
helicase activity of nsp13 (Adedeji et al., 2012a) and for the fact that both 
nsp12 and nsp13 (like most other CoV replicase cleavage products) associate
with the membranous replication organelles in nidovirus-infected cells 
(Denison et al., 1999; Ivanov et al., 2004b; Knoops et al., 2008). In spite 
of all these indications for RdRp-helicase interplay, the polarity of nidovirus 
helicase-mediated unwinding (5’-to-3’) remains a major conundrum, as it is 
opposite to the polarities of the RdRp and many other +RNA helicases,
which move in a 3’-to-5’ direction on the RNA strand they initially bind
to (Seybert et al., 2000a). This strongly suggests that the two enzymes cannot 
simply operate as a tandem that moves along the same template strand while 
copying it, a consideration that also applies to the SF1B helicase employed 
by alphaviruses. This problem could be resolved if RdRp and helicase would 
move along different strands of the same RNA duplex, which might allow 
the helicase to separate the two strands and provide a single-stranded tem- 
plate to be copied by the RdRp (Lehmann et al., 2015c). Also, it is tempting 
to speculate that the helicase, by trailing along the nascent strand (following 
the RdRp at a certain and, possibly, somewhat flexible distance), provides 
(i.e., leaves behind) a single-stranded template, thus facilitating initiation and 
elongation of RNA synthesis by the next RTC. This, for example, could 
occur in cases when multiple RTCs act simultaneously/consecutively on 
the same template to produce multiple plus-strand RNAs from the same 
minus-strand template, a process that is generally thought to add to the large 
excess of plus- over minus-strand RNAs in nidovirus-infected cells. Clearly, 
significantly more work is needed to explore this possibility. 
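
The geometric argument behind this proposal can be made explicit with a toy model. The Python sketch below is purely illustrative and not a biochemical simulation: the coordinate x along the duplex axis, the function name, and the constants are assumptions introduced here. It shows that a 5’-to-3’ helicase placed on the template strand would travel against the RdRp, whereas the same helicase on the complementary (antiparallel) strand would travel in the same physical direction as the RdRp.

```python
# Toy geometric model (an illustration only, not a biochemical simulation).
# x runs along the duplex axis; a strand is described by where its 5' end sits.

def travel_direction(five_prime_at_low_x: bool, polarity: str) -> int:
    """Return +1 if an enzyme moves toward higher x, -1 if toward lower x.

    `polarity` is "5to3" or "3to5" and refers to the strand the enzyme
    tracks along.
    """
    if polarity not in ("5to3", "3to5"):
        raise ValueError("polarity must be '5to3' or '3to5'")
    moves_5_to_3 = polarity == "5to3"
    # Moving 5'->3' on a strand whose 5' end is at low x means moving to higher x.
    return 1 if moves_5_to_3 == five_prime_at_low_x else -1


# Template strand: 3' end at low x, so its 5' end is at high x.
TEMPLATE_5P_AT_LOW_X = False
# The complementary strand is antiparallel: its 5' end is at low x.
COMPLEMENT_5P_AT_LOW_X = True

# The RdRp reads its template in the 3'-to-5' direction.
rdrp = travel_direction(TEMPLATE_5P_AT_LOW_X, "3to5")
# A 5'-to-3' helicase on the same strand runs against the RdRp ...
helicase_same_strand = travel_direction(TEMPLATE_5P_AT_LOW_X, "5to3")
# ... but on the complementary strand it runs alongside the RdRp.
helicase_other_strand = travel_direction(COMPLEMENT_5P_AT_LOW_X, "5to3")

print(rdrp, helicase_same_strand, helicase_other_strand)
```

The model says nothing about whether such a configuration actually occurs in infected cells; it only shows that the proposal is geometrically consistent.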

Nidovirus sg mRNAs are each produced from their own subgenome- 
length minus-strand template (see Section 1). In the case of arteri- and cor- 
onaviruses, these derive from a process of discontinuous minus-strand RNA 
synthesis and this unique mechanism likely requires specific functional inter- 
actions between (among others) RdRp and helicase. These interactions may 
contribute to maintaining a proper balance between continuous and discon- 
tinuous minus-strand RNA synthesis, and thus between the production of 
new genomes and sg mRNAs. The serendipitous identification of an 
arterivirus nsp10-ZBD mutation (Ser-59→Pro) that essentially inactivated
transcription while leaving replication unaffected was an early indication for
the involvement of the nidovirus helicase in the control of sg mRNA synthesis
(van Dinten et al., 1997). Such control functions could also be related
to the recognition of TRSs (Fig. 1), the frequency with which each of the 
TRSs is used to produce a subgenome-length minus strand, or mechanistic 
aspects of the stalling and reinitiation of RNA synthesis or the transfer of the 
nascent strand to an upstream position on the template (see earlier), which
must occur during the discontinuous step in sg RNA synthesis (Lehmann
et al., 2015c). Recently, the EAV nsp10 Ser-59→Pro mutation, which
selectively reduces transcription of all subgenomic mRNAs to below 1% 
of their normal levels, was reanalyzed in the context of the nsp10 crystal 
structure. As postulated when this virus mutant was first described, its phe- 
notype appears to be based on the special structural properties of the Pro res- 
idue in combination with the position of residue 59 in a “hinge” region that 
connects ZBD to the rest of the protein (Deng et al., 2014). Although res- 
idue 59, located just downstream of the second zinc-binding module of the 
ZBD, is fairly distant from the RNA-binding surface, it resides in a region 
that connects the ZBD treble-clef zinc finger to a helix that interacts with 
domains 1A and 1B and with the nucleic acid. Thus, specific mutations 
affecting the flexibility of this hinge region may drastically influence the 
long-distance signaling within nsp10, apparently preventing the
RNA-synthesizing machinery from working in “transcription mode” and dedicating
it exclusively to full-length minus-strand RNA synthesis and genome amplification.
Since mutations of this kind barely affected nsp10’s in vitro NTPase
and helicase activities (Seybert et al., 2005), it may well be that changed 
interactions with specific protein partners will turn out to be the key to 
explaining the transcription-negative phenotype of the corresponding virus 
mutants (Lehmann et al., 2015c). For the coronavirus IBV, a point mutation 
in a somewhat comparable position of nsp13 (Arg-132→Pro; just downstream
of ZBD) was reported to cause a similar block of sg mRNA synthesis
(Fang et al., 2007) but, thus far, this observation has not been followed up in 
more detail for IBV or confirmed for nsp13 of another CoV. 

In addition to its role in RNA synthesis, the nidovirus helicase is also 
assumed to be involved in the capping pathway of viral mRNAs by 
exhibiting an RNA 5’-triphosphatase (RTPase) activity that can remove
the γ-phosphate from the 5’-terminal triphosphate of the RNA substrate. For SARS-CoV
and HCoV-229E nsp13, this first step in viral cap synthesis was shown to 
rely on the same NTPase active site that provides the energy for the protein’s 
helicase activity (Ivanov and Ziebuhr, 2004; Ivanov et al., 2004b). The CoV 
capping pathway is discussed in Section 5 of this review. 

Finally, the remarkable similarities between EAV nsp10 and the cellular 
helicase Upf1 (Deng et al., 2014) have given rise to the intriguing but still
speculative hypothesis that the nidovirus helicase may be involved in the 
posttranscriptional quality control of viral RNA. Common features of the 
two helicases include their 5’-to-3’ polarity of unwinding, their lack of sub- 
strate specificity and striking similarities in terms of domain organization and 
fold (Fig. 4) (Lehmann et al., 2015c). Using several pathways, including
nonsense-mediated mRNA decay, Upf1 mediates RNA quality control
in eukaryotic cells (Cheng et al., 2007; Clerici et al., 2009), while its activity 
can be modulated through interactions of its N-terminal ZBD. It was pos- 
tulated that a similar function in nidovirus replication, e.g., detection and 
elimination of defective viral mRNAs (including the genome), could 
explain the conservation of the unique ZBD across the nidovirus order, 
which stands out for containing members with very large +RNA genomes.
Such a function would prevent the synthesis of defective viral polypeptides,
which might interfere with the proper functioning of full-length viral proteins.
In this manner, not unlike the nsp14-ExoN domain (see earlier), the
nidovirus helicase may have contributed to genome expansion by providing 
a form of compensation for the relatively poor fidelity of genome replication 
by the nidoviral RdRp (Deng et al., 2014). Clearly, this is just one of the 
scenarios for the involvement of the helicase in the posttranscriptional fate 
of viral RNA products. Further experimental work will be needed to 
explore these possibilities in more detail, as they are compatible with the 
much broader realization that the functions of RNA helicases can extend 
far beyond merely the unwinding of RNA structure. 


4.5 The Coronavirus Helicase as Drug Target 


Due to its multifunctionality and involvement in several key processes in 
viral RNA synthesis and processing, the nidovirus helicase is an important 
target for antiviral drug development, which was mainly explored for CoVs 
following the SARS-CoV outbreak. The highly conserved nature of the 
helicase offers the interesting perspective of developing inhibitors with a 
potential broad-spectrum activity. On the other hand, avoiding toxicity 
resulting from inhibition of the abundant cellular NTPases/helicases poses 
a serious challenge, which is why—as in the case of the RdRp—obtaining 
a crystal structure for a CoV helicase should be considered a research prior- 
ity. In theory, a variety of helicase properties may be targeted with specific 
inhibitors, ranging from the active and nucleic acid-binding sites of the 
enzyme to interaction surfaces for multimerization and modulation by pro- 
tein cofactors (Kwong et al., 2005). Several compound families were found 
to target the ATPase of nsp13, and thus also its helicase activity. These 
include naturally occurring flavonoids (Yu et al., 2012), chromones (Kim 
et al., 2011), and bananins (Kesel, 2005; Tanner et al., 2005), all exhibiting 
in vitro IC₅₀ values in the low-micromolar range. Other compounds appear
to target the helicase activity of nsp13 more specifically, like the triazole
SSYA10-001, which was found to inhibit the replication of multiple beta- 
CoVs (SARS-CoV, MERS-CoV, and MHV) in cell culture-based infection 
models (Adedeji et al., 2012b, 2014). The IC₅₀ value in in vitro helicase
assays was about 5.5 μM, whereas EC₅₀ values in cell culture assays were
in the range of 7-25 μM, depending on the CoV analyzed, suggesting that
broad-spectrum activity may indeed be achieved. To our knowledge, the 
antiviral potential and toxicity of SSYA10-001 in animal models remain 
to be tested. Another interesting group of helicase-directed antiviral hits 
are bismuth complexes, which were postulated to inhibit the NTPase and 
helicase functions by competing for zinc ions with the ZBD. In
SARS-CoV-infected cell cultures, the determined EC₅₀ and CC₅₀ values were
6 μM and 5 mM, respectively (Yang et al., 2007).
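
A common way to relate the antiviral and cytotoxic concentrations of a compound is the selectivity index, the ratio of CC₅₀ to EC₅₀. As a worked example, the short Python sketch below computes this ratio for the bismuth complex values cited above, read here as an EC₅₀ of 6 micromolar and a CC₅₀ of 5 millimolar. The function name is an assumption introduced for illustration, and the index itself was not reported in the text above.

```python
def selectivity_index(cc50_molar: float, ec50_molar: float) -> float:
    """Ratio of cytotoxic to antiviral concentration (both in mol/L).

    A higher value indicates a wider window between antiviral activity
    and toxicity to the host cell.
    """
    if cc50_molar <= 0 or ec50_molar <= 0:
        raise ValueError("concentrations must be positive")
    return cc50_molar / ec50_molar


EC50 = 6e-6  # 6 micromolar, value cited above for infected cell cultures
CC50 = 5e-3  # 5 millimolar, value cited above

si = selectivity_index(CC50, EC50)
print(f"selectivity index: about {si:.0f}")  # 5e-3 / 6e-6 is roughly 833
```

Both concentrations must be expressed in the same unit before the ratio is taken, which is why they are converted to mol/L here.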


5. THE CORONAVIRUS CAPPING MACHINERY: 
nsp10–13–14–16


Cap structures consist of a 7-methylguanosine (m⁷G) linked to the
first nucleotide of the RNA transcript through a 5’-5’ triphosphate bridge
(for a review, see Decroly et al., 2012). In eukaryotic cells, the synthesis of
the cap structure is a multistep process that occurs in the nucleus and is
coupled to RNA pol II-driven transcription (Shatkin, 1976). Capping
begins with the hydrolysis of the 5’ γ-phosphate of the nascent RNA transcript
by an RNA 5’-triphosphatase (RTPase). Subsequently, a GMP molecule
is transferred to the 5’-diphosphate of the RNA by a GTase, leading to
the formation of GpppN-RNA. The cap structure is methylated at the N7
position of the guanosine by an S-adenosyl-L-methionine (AdoMet)-dependent
N7-MTase, yielding cap-0 (m⁷GpppN). The cap-0 structure is then converted into cap-1
(m⁷GpppN2’-Om) or cap-2 by an AdoMet-dependent 2’-O-MTase that
methylates the 2’-O position of the ribose of the first or first and second
RNA nucleotide, respectively. 
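
The four enzymatic steps outlined above can be summarized as an ordered series of transformations of the RNA 5’ end. The following minimal Python sketch is only a bookkeeping illustration: the string notation (for example "pppN" for a 5’-triphosphorylated end and "m7GpppNm" for cap-1) and all function names are assumptions introduced here, not established nomenclature or existing software.

```python
def rtpase(end: str) -> str:
    """RNA 5'-triphosphatase: remove the gamma-phosphate (pppN -> ppN)."""
    if not end.startswith("ppp"):
        raise ValueError("RTPase expects a 5'-triphosphate end")
    return end[1:]


def gtase(end: str) -> str:
    """Guanylyltransferase: transfer GMP onto the 5'-diphosphate (ppN -> GpppN)."""
    if not end.startswith("pp") or end.startswith("ppp"):
        raise ValueError("GTase expects a 5'-diphosphate end")
    return "Gp" + end


def n7_mtase(end: str) -> str:
    """N7-MTase: methylate the cap guanosine at N7 (GpppN -> m7GpppN, cap-0)."""
    if not end.startswith("Gppp"):
        raise ValueError("N7-MTase expects an unmethylated GpppN cap")
    return "m7" + end


def two_o_mtase(end: str) -> str:
    """2'-O-MTase: methylate the ribose of the first nucleotide (cap-0 -> cap-1)."""
    if not end.startswith("m7Gppp") or end.endswith("m"):
        raise ValueError("2'-O-MTase expects a cap-0 structure")
    return end + "m"


def make_cap1(end: str) -> str:
    """Apply the four steps in the order described in the text."""
    return two_o_mtase(n7_mtase(gtase(rtpase(end))))


cap1 = make_cap1("pppN")
print(cap1)  # m7GpppNm, i.e., cap-1 in the notation assumed here
```

Because each function checks its substrate, calling the steps out of order raises an error, which mirrors the sequential dependence of the reactions in the canonical pathway.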

Due to the cytoplasmic localization of their mRNA synthesis, 
nidoviruses, and all other +RNA viruses of eukaryotes, cannot rely on 
the standard capping pathway outlined earlier, which is executed by host cell 
enzymes in the nucleus. Cap structures can protect viral mRNAs from degradation
by the cellular 5’-to-3’ exoribonucleases involved in RNA turnover
(Liu and Kiledjian, 2006). Cap methylation is critical for mRNA recogni- 
tion by translation initiation factor eIF4E (Filipowicz et al., 1976; Ohlmann 
et al., 1996), and thus for viral translation and replication as a whole (Ferron 
et al., 2012). In addition to promoting mRNA stability and securing their
recognition by host cell ribosomes, capping of viral mRNAs also promotes 
escape from certain antiviral responses of the host cell. The retinoic
acid-inducible gene I (RIG-I) and melanoma differentiation-associated protein 5
(Mda5) were shown to detect either uncapped triphosphorylated RNA
or cap-0-containing RNA (Devarkar et al., 2016; Hyde and Diamond,
2015; Hyde et al., 2014; Schuberth-Wagner et al., 2015; Züst et al.,
2011), resulting in the expression of antiviral interferon-stimulated genes 
(ISGs) in infected and neighboring cells. Interferon-induced proteins with 
a tetratricopeptide repeat 1 (IFIT1/56) and IFIT2/54 (IFIT2) have been 
shown to recognize miscapped RNAs, in order to restrict viral translation 
(Daffis et al., 2010; Pichlmair et al., 2011). A subsequent study identified 
IFIT1 as the only interferon-induced protein whose RNA-binding affinity 
was affected by the ribose-2’-O methylation state of the 5’ cap structure 
(cap-0/cap-1). The data support a model in which IFIT1 efficiently binds
and sequesters capped mRNA that lacks a ribose-2’-O-methyl group. Con- 
sistent with this, viral mRNA translation was shown to be reduced in cells 
infected with 2’-O-MTase-deficient CoVs (Habjan et al., 2013). Other 
studies suggested that 2’-O methylation of cap structures prevents or 
delays the Mda5-dependent recognition of viral mRNAs as “nonself.” 
This mRNA cap modification thus limits the antiviral response launched 
upon infection, thereby affecting viral pathogenesis (Daffis et al., 2010; 
Schuberth-Wagner et al., 2015; Züst et al., 2011).

The genomic and sg mRNAs of nidoviruses are presumed to be capped
at their 5’ end and polyadenylated at their 3’ end (Fig. 1). The presence of a
cap structure was first suggested based on studies using ³²P-labeled MHV
RNA that was digested with RNase T1 and T2 and subjected to DEAE- 
cellulose chromatography (Lai and Stohlman, 1981). The presence of a 
cap was further substantiated by immunoprecipitation experiments using 
a cap-specific monoclonal antibody that was shown to specifically bind to 
equine torovirus mRNAs (van Vliet et al., 2002). Although the (presumed) 
capping machinery of arteriviruses has remained essentially uncharacterized 
thus far (Lehmann et al., 2015b), three conserved putative capping enzymes 
were identified in the conserved ORF1b-encoded part of the replicase of 
Coronaviridae, Roniviridae, and Mesoniviridae, which all have substantially 
larger genomes. These enzymatic activities, which were proposed to participate
in the synthesis of a cap-1 structure (m⁷GpppN2’-Om), are the following:
(i) the nsp13 helicase/RTPase (Ivanov and Ziebuhr, 2004), (ii) the
nsp14 N7-MTase (Chen et al., 2009c; Ma et al., 2015), and (iii) the 
nsp16 2’-O-MTase (Bouvet et al., 2010; Decroly et al., 2008; Snijder et al.,
2003; von Grotthuss et al., 2003; Zeng et al., 2016). 

At present, structural and functional studies, which were mainly per- 
formed with purified recombinant SARS-CoV nsps, suggest that CoVs fol- 
low a capping pathway that is very similar to that of eukaryotic cells. Cap 
synthesis is presumed to start by hydrolysis of the 5’ end of a nascent 
RNA by the RTPase function of nsp13 to yield pp-RNA (Ivanov and 
Ziebuhr, 2004). Then, a still elusive GTase must transfer a GMP molecule 
onto the pp-RNA to yield Gppp-RNA. The cap structure is then methyl- 
ated at the N7 position by the N7-MTase domain of the bifunctional nsp14 
(Bouvet et al., 2010; Chen et al., 2009c). Cap modification is completed by 
the conversion of the cap-0 into a cap-1 structure, which involves the 
nsp10/nsp16 complex (Bouvet et al., 2010; Chen et al., 2009c) in which 
nsp16 possesses 2’-O-MTase activity. In the following paragraphs, we will
describe the different CoV enzymes involved in mRNA capping in more 
detail. 


5.1 The nsp13 RNA 5’ Triphosphatase 


As described earlier, the nsp13 helicase domain is thought to unwind 
double-stranded RNA in a 5’-to-3’ direction, an activity that energetically
depends on the nucleotide triphosphatase (NTPase) function of the enzyme 
(Ivanov and Ziebuhr, 2004; Ivanov et al., 2004b; Seybert et al., 2000a). 
Additionally, using the same active site, the protein was found to exhibit 
RTPase activity (Ivanov and Ziebuhr, 2004), which was found to be 
abolished if the conserved active-site Lys residue of the Walker A box motif 
was replaced with Ala (Ivanov and Ziebuhr, 2004; Ivanov et al., 2004b). 
Specific RTPase activity on viral mRNA species would require specific 
recruitment of nsp13 to the 5’ end of viral mRNAs, which has not been 
demonstrated for CoV helicases but may involve yet other factors. In this 
context, it remains to be studied if the common leader sequence present 
on CoV mRNAs contributes to the recruitment of nsp13 and/or other pro- 
teins involved in 5’ capping reactions. 

Several other +RNA virus helicases were shown to possess an activity that 
can target the phosphodiester bond between the β and γ phosphate groups of
the 5’-terminal NTP of diverse substrate RNAs, suggesting that this dual
function of helicase and capping RTPase is a common feature in this group 
of viruses (Decroly et al., 2012; Ivanov and Ziebuhr, 2004). Even though 
experimental evidence has been obtained to suggest an nsp13-associated 5’ 
RTPase activity, coronavirus nsp13 homologs proved to be unable to trans-complement
the yeast strain YBS20, which lacks the CET1 locus encoding
the yeast RTPase involved in mRNA capping (Chen et al., 2009c). The lack 
of trans-complementation by nsp13 could be due to technical reasons, such as 
inappropriate subcellular localization of nsp13, misfolding of the protein, or a 
functional mismatch with other players of the distantly related yeast capping 
machinery. Alternatively, it could indicate specific substrate requirements for 
coronavirus nsp13-associated RTPase activities. In any case, a possible 
involvement of nsp13 in the first step of CoV mRNA capping remains to 
be corroborated in further studies, for example, by in vitro reconstitution 
experiments of the entire CoV capping pathway. 


5.2 The Elusive RNA GTase 


The CoV nsp involved in the second step of RNA capping, GTase, remains 
to be identified. Bioinformatics analysis of CoV genome sequences failed to 
identify replicase gene-encoded domains that may perform this activity. 
Eukaryotic RNA capping enzymes belong to the ligase family and have been 
shown to form a GTase-GMP adduct upon incubation with GTP (Decroly 
et al., 2012). A substantial number of SARS-CoV nsps were expressed and 
purified (nsp7, nsp8, nsp10, and nsp12—16), but covalent linkage of GMP to 
any of these proteins could not be demonstrated (Jin et al., 2013). In addition,
nsps were screened for GTase activity in a yeast trans-complementation 
system using the YBS2 strain lacking the gene (ceg1) encoding the yeast 
GTase (Chen et al., 2009c), but also this powerful approach failed to identify 
the CoV GTase. Consequently, several hypotheses remain to be explored. 
First, it is possible that the N-terminal NiRAN domain of nsp12 (see 
earlier) forms a covalent adduct with GTP, as observed for its arterivirus 
homolog nsp9 (Lehmann et al., 2015a). Another possibility is that the 
CoV capping pathway is unconventional and, for example, resembles that 
of alphaviruses in which the GTP molecule needs to be methylated at its 
N7 position before the GTase-m⁷GMP adduct can be formed (Ahola and
Ahlquist, 1999). Interestingly, this second possibility might explain nsp14’s
capability to convert GTP into m⁷GTP (see later) (Jin et al., 2013). Finally,
the involvement of a host GTase remains an interesting possibility, in par- 
ticular since cytoplasmic forms of cellular capping enzymes have been 
described recently (Mukherjee et al., 2012; Schoenberg and Maquat, 
2009). Further work is needed to explore these various hypotheses and 
resolve this important question. 


5.3 The nsp14 N7-Methyl Transferase


Coronavirus nsp14 is a bifunctional protein that plays a crucial role in viral 
RNA synthesis. Its N-terminal exonuclease (ExoN) domain (Minskaia et al., 
2006; Snijder et al., 2003), which is thought to promote the fidelity of CoV
RNA synthesis, will be discussed in more detail in Section 6. The C-termi- 
nal part of nsp14 carries an AdoMet-dependent guanosine N7-MTase activ- 
ity. Interestingly, as in the case of the GTase (see earlier), bioinformatics 
analyses of CoV genome sequences failed to identify proteins or protein 
domains related to cellular and/or viral N7-MTases, again illustrating the 
significant divergence of nidoviruses from other viral and cellular systems. 
However, using a trans-complementation assay and a yeast strain lacking 
the abd1 (N7-MTase) gene, Guo et al. discovered a SARS-CoV nsp14-asso- 
ciated N7-MTase activity (Chen et al., 2009c) by demonstrating that the 
protein was able to restore the growth of the Δabd1 yeast mutant. They also
showed that a range of alphacoronavirus nsp14 homologs were able to complement
the defect of the Δabd1 yeast strain. The N7-MTase activity of
nsp14 was subsequently confirmed and characterized using purified recom- 
binant enzymes (Bouvet et al., 2012; Chen et al., 2009c). The SARS-CoV 
N7-MTase was shown to methylate 5’ cap structures in a sequence- 
independent manner using a range of RNAs and it also proved to be active 
on cap analogues and GTP (Jin et al., 2013), corroborating the trans-com- 
plementation experiments in yeast in which rescue required efficient meth- 
ylation of a wide range of cellular RNAs. In contrast to the ExoN activity, 
the in vitro N7-MTase activity was not found to be affected by interactions 
with nsp10 (Bouvet et al., 2010). 

The N7-MTase domain was further characterized by alanine scanning 
mutagenesis and key residues for enzymatic activity were identified 
(Chen et al., 2009c) including 10 crucial residues distributed throughout 
the domain and two clusters of residues essential for MTase activity 
(Fig. 5). The first cluster (nsp14 residues 331-336) corresponds to the 
DXGXPXA motif of the AdoMet-binding site. In cross-linking experiments,
mutations in this motif strongly decreased the binding of ³H-labeled
AdoMet. The role of the second cluster, between residues 414 and 428, was
revealed by X-ray structure analysis of a SARS-CoV nsp10/nsp14 complex
expressed in E. coli (Fig. 6) (Ma et al., 2015). These residues form a constricted
pocket that holds the cap structure (GpppA) between two β-strands
(β1 and β2) and helix 1, placing the N7 position of the guanine in close proximity
to AdoMet and ready for methyl transfer using an in-line mechanism.
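
A sequence pattern such as the DXGXPXA motif mentioned above, in which X stands for any residue, can be located in a protein sequence with a simple regular expression. The Python sketch below is a minimal illustration: the function name is an assumption, and the short test sequence is invented for demonstration and is not taken from any nsp14 sequence.

```python
import re

# D-X-G-X-P-X-A, where X is any single residue (one-letter amino acid code).
ADOMET_MOTIF = re.compile(r"D.G.P.A")


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each motif hit.

    Note: re.finditer reports non-overlapping matches only.
    """
    return [(m.start() + 1, m.group()) for m in ADOMET_MOTIF.finditer(sequence.upper())]


# Invented toy sequence, used only to demonstrate the search.
toy_sequence = "MKTLDAGSPKAVLLE"
print(find_motif(toy_sequence))  # [(5, 'DAGSPKA')]
```

Applied to a real nsp14 sequence, a hit for this pattern would be expected around residues 331-336 according to the text above, but that check is left to the reader because no sequence is reproduced here.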

Fig. 5 Sequence alignment of coronavirus nsp14 homologs representing three genera
of the Coronavirinae subfamily: SARS-CoV (genus Betacoronavirus), HCoV-229E (genus 
Alphacoronavirus), and IBV (genus Gammacoronavirus). The alignment was generated 
using Clustal Omega (Sievers et al., 2011) and rendered using ESPript version 3.0 
(Robert and Gouet, 2014). Conserved ExoN motifs I, II, and III and clusters of residues
involved in SAM binding and N7-MTase activity (1 and 2) are highlighted in gray. Cat- 
alytic residues of ExoN and residues involved in the formation of zinc fingers are indi- 
cated by asterisks and arrowheads, respectively. Also shown are the secondary structure 
elements of SARS-CoV nsp14 (pdb 5C8S) and the border between the N-terminal ExoN 
and C-terminal N7-MT (NMT) domains. 

Fig. 6 Surface representation of the three-dimensional structure of the nsp10/nsp14
complex (pdb 5C8S). The nsp10 ribbon structure is shown with conserved residues col- 
ored in blue (using a scale from dark to light blue). The coloring of the nsp14 surface is 
based on the conservation of the respective residues among CoVs (using a scale from dark 
to light red). The upper panel shows the surface containing the ExoN catalytic site with one 
Mg²⁺ ion bound in the active site (green sphere). The lower panel shows the opposite side
of the structure with the N7-MTase active site. The cap analog GpppA and SAH are shown 
in stick representation. The figures were generated using UCSF Chimera (Pettersen et al., 
2004). The degree of conservation of specific residues was determined using an align- 
ment of nsp10 and nsp14 sequences of eight coronaviruses representing the four genera 
of the Coronavirinae subfamily (SARS-CoV, MERS-CoV, MHV, TGEV, FCoV, HCoV-229E, IBV, 
and bulbul coronavirus HKU11). 


The structure also revealed that the nsp14 N7-MTase domain exhibits a
noncanonical MTase fold. Whereas the canonical fold contains a 7-strand
β-sheet that is commonly present in the Rossmann fold, nsp14 contains only
a 5-strand β-sheet and an insertion of a 3-strand antiparallel β-sheet between
β5 and β6. In line with previous mutagenesis data (Chen et al., 2009c), the
nsp14 X-ray structure revealed functionally relevant interactions between
the N-terminal (ExoN) and C-terminal (N7-MTase) domains, with three
α-helices of ExoN stabilizing the base of the N7-MTase substrate-binding
pocket.

The specific role of N7-MTase activity in virus replication was supported 
by reverse genetics. Nsp14 mutations were introduced in SARS-CoV RNA 
replicons expressing a luciferase reporter gene. A D331A mutation in the 
AdoMet-binding site, which blocks the N7-MTase activity of nsp14, was 
shown to reduce the luciferase expression (by 90%), indicating that viral 
RNA replication and/or transcription were impaired in this in vitro system 
(Chen et al., 2009c). 

The nsp14 N7-MTase is an attractive target for antiviral strategies, espe- 
cially because it exhibits a range of features that are distinct from host cell 
MTases (Ferron et al., 2012). The druggability of the enzyme was explored 
using a small set of previously documented MTase inhibitors. These in vitro 
assays revealed that AdoHcy (the coproduct of the methylation reaction), 
sinefungin (another AdoMet analog), and ATA efficiently inhibited nsp14 
N7-MTase activity with IC50 values of 12 μM, 39.5 nM, and 2.1 μM,
respectively (Bouvet et al., 2010). ATA was also shown to limit SARS-CoV
replication in infected cells (He et al., 2004). In the yeast-ΔMTase
trans-complementation assay mentioned earlier, micromolar concentrations 
of sinefungin were reported to effectively suppress the nsp14 N7-MTase 
activity of SARS-CoV, MHV, transmissible gastroenteritis virus (TGEV), 
and IBV (Sun et al., 2014). However, other compounds, such as ATA 
and AdoHcy, did not exert an inhibitory effect in the context of yeast cells. 
These discrepancies may reflect differences in cell penetration of the com- 
pounds between yeast and (virus-infected) mammalian cells. The yeast sys- 
tem was also applied to screen a library of 3000 natural product extracts, and 
three hits were obtained displaying potent inhibitory effects on the CoV 
N7-MTase (Sun et al., 2014). Further work is needed to optimize these hits 
and test their inhibitory activities in assays using CoV-infected cells. 


5.4 The nsp16 2’-O-Methyl Transferase 


The presence of a 2'-O-methyl transferase (2'-O-MTase) domain in CoV
nsp16 was first inferred using bioinformatics (Snijder et al., 2003; von
Grotthuss et al., 2003). Computational threading produced a model containing
a conserved K-D-K-E catalytic tetrad that is characteristic of
AdoMet-dependent 2'-O-MTases and a conserved AdoMet-binding site
(Fig. 7) (von Grotthuss et al., 2003). The 2'-O-MTase activity was then confirmed
by in vitro biochemical assays using purified FCoV nsp16 (Decroly
et al., 2008). The recombinant protein was shown to specifically recognize 
short, cap-0-containing RNAs and to transfer a methyl group from AdoMet
to the 2'-O position of the first nucleotide of the N7-methylated substrate. 
In contrast, a recombinant form of the SARS-CoV nsp16 homolog was 
inactive under similar experimental conditions. Since yeast two-hybrid 
experiments and GST pull-down assays had revealed that SARS-CoV 
nsp16 strongly interacts with nsp10 (Imbert et al., 2008; Pan et al., 2008), 
a possible involvement of the latter in regulating or supporting the 2’-O- 
MTase activity was tested. It was shown that purified nsp10 interacts with 
nsp16 in vitro and thereby triggers a robust 2'-O-MTase activity (Bouvet
et al., 2010). Effective methyl transfer was demonstrated using synthetic 
capped N7-methylated RNA and longer RNA mimicking the 5’ end of 
the SARS-CoV genome (Bouvet et al., 2010). In contrast, RNA with a 
nonmethylated cap structure (Gppp-RNA) was not bound by the nsp10/ 
nsp16 complex and, consequently, could not serve as a substrate. These 
observations suggest that SARS-CoV mRNA capping follows a strict order 
of reaction steps: after GTP transfer by the still elusive GTase, the cap is 
methylated by the nsp14 N7-MTase at the guanosine-N7 position to pro- 
duce a cap-0 structure that, in a subsequent reaction, is bound by the nsp10/ 
nsp16 complex and converted to a cap-1 structure employing the 2’-O- 
MTase activity of nsp16. 
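
The ordered capping pathway described above can be summarized as a small state model. The following Python sketch is purely illustrative: the names CapState, n7_methylate, two_prime_o_methylate, and apply_steps are hypothetical labels introduced here, not terms from the cited studies, and the model encodes only the substrate requirement reported for the nsp10/nsp16 complex (no binding of Gppp-RNA).

```python
from enum import Enum


class CapState(Enum):
    """5'-end states of a capped RNA (labels are illustrative assumptions)."""
    GPPP = "Gppp-RNA"        # guanylylated, unmethylated cap
    CAP0 = "m7Gppp-RNA"      # N7-methylated cap (cap-0)
    CAP1 = "m7GpppNm-RNA"    # cap-0 plus 2'-O-methylated first nucleotide (cap-1)


def n7_methylate(state):
    """Model of the nsp14 N7-MTase step: Gppp-RNA -> cap-0."""
    return CapState.CAP0 if state is CapState.GPPP else state


def two_prime_o_methylate(state):
    """Model of the nsp10/nsp16 step: only cap-0 is accepted as a substrate.

    Unmethylated Gppp-RNA is returned unchanged, mirroring the observation
    that it is not bound by the nsp10/nsp16 complex.
    """
    return CapState.CAP1 if state is CapState.CAP0 else state


def apply_steps(state, steps):
    """Apply enzymatic steps in the given order and return the final state."""
    for step in steps:
        state = step(state)
    return state


if __name__ == "__main__":
    ordered = apply_steps(CapState.GPPP, [n7_methylate, two_prime_o_methylate])
    swapped = apply_steps(CapState.GPPP, [two_prime_o_methylate, n7_methylate])
    print(ordered.value)   # final state when N7 methylation comes first
    print(swapped.value)   # final state when the order is reversed
```

In this toy model, reversing the two methylation steps leaves the RNA at the cap-0 stage, which is the behavior expected if cap-0 formation must precede 2'-O-methylation.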

Mutagenesis of SARS-CoV nsp10 and nsp16 confirmed the importance of 
the catalytic tetrad of the nsp16 2’-O-MTase (Decroly et al., 2011) and 
showed that the nsp10—nsp16 interaction is absolutely required for activity 
(Decroly et al., 2011; Lugari et al., 2010). Crystallographic studies of the 
nsp10/nsp16 complex revealed the molecular basis for the stimulation of 
the nsp16 2'-O-MTase activity by nsp10 (Fig. 7) (Chen et al., 2011; 
Decroly et al., 2011). The CoV 2’-O-MTase belongs to the RrmJ/fibrillarin 
superfamily of ribose 2'-O-methyl transferases (Feder et al., 2003), which have
a number of viral orthologs in flaviviruses, alphaviruses, and other nidoviruses.
As mentioned earlier, this family contains a conserved K-D-K-E catalytic tetrad
(Fig. 7) that is located in close proximity to the substrate-binding pocket
accommodating the RNA substrate. The structure revealed that nsp10 is an 
allosteric regulator that stabilizes nsp16. Moreover, structural and biochemical 
analyses indicated that nsp10 binding extends and narrows the RNA-binding 
groove that accommodates the RNA substrate, thereby promoting the RNA- 
and AdoMet-binding capabilities of nsp16 (Fig. 7).

The functional relevance of 2'-O-MTase regulation by nsp10 for virus
replication is not yet understood. The nsp16-associated 2’-O-MTase activ- 
ity (itself) is highly conserved across the Coronaviridae family and its func- 
tional relevance has been supported by reverse-genetics studies using 
genetically engineered alpha- and betacoronavirus mutants that lack 2’- 
O-MTase activity (Devarkar et al., 2016; Hyde and Diamond, 2015; 
Schuberth-Wagner et al., 2015; Zust et al., 2011). Furthermore, the pheno- 
type of temperature-sensitive MHV mutants suggested nsp16 to play a role 
in RNA synthesis or in the stability of plus-strand RNA (Sawicki et al., 
2005). An early SARS-CoV study was able to show that the insertion of 
a stop codon immediately upstream of the nsp16-coding sequence blocked 
viral RNA synthesis (Almazan et al., 2006). Subsequently, for HCoV-229E, 
MHV, and SARS-CoV, several mutants with reduced or ablated 2'-O-MTase
activity were described that generally retained robust viral replication
in cell culture (Menachery et al., 2014; Zust et al., 2011). The studies also 
revealed that the impact of nsp16 mutations may depend on the cell types 
used and that the lack of 2'-O-MTase activity causes more profound effects
in primary cells and immune cells. The SARS-CoV nsp16 mutants were fur- 
ther characterized in infected mice and showed a robust attenuation as 
judged by viral titers, weight loss, lung histology, and respiratory function. 
The nsp16 mutants also displayed increased sensitivity to treatment with 
type I interferon. This was also observed for the corresponding MHV 
and HCoV-229E nsp16 mutants (Zust et al., 2011). However, in contrast 
to the latter study, the SARS-CoV nsp16 mutant was not found to induce 
type I interferons, either in vitro or in vivo (Menachery et al., 2014). This 
observation suggests that SARS-CoV may have a larger repertoire of func- 
tions for preventing induction of type I interferons following cellular sensing 
of “nonself” RNAs, such as viral RNAs with incompletely methylated 5’
cap structures. Together, these data established that the highly conserved 
nsp16 2'-O-MTase plays an important role in limiting the detection of viral
RNA by the host’s antiviral sensors, but that the specific role of this activity
in escaping host innate immune responses may differ to some extent among
CoVs. The mechanism underlying the induction of the antiviral immune
response following infection with 2'-O-MTase knockout mutants was studied
using specific knockout mice in which viral replication was found to be
restored. The absence of Mda5, a cap-0 sensor, restored MHV replication,
whereas the nuclear translocation of IRF3 and interferon induction were
reduced (Zust et al., 2011). This suggested Mda5 to function in the primary
recognition of the RNA produced by CoV 2'-O-MTase mutants and to initiate
the ISG cascade that restricts viral replication. Among the ISG products,
the IFIT family of proteins was shown to be critically involved in reducing
the replication of CoV nsp16 mutants. The replication of MHV and HCoV-229E
nsp16 mutants and wild-type controls was similar in IFIT1−/−
knockout mice (Habjan et al., 2013; Zust et al., 2011). Likewise, replication
of a SARS-CoV Δnsp16 mutant was increased in both IFIT1 and IFIT2
knockdown mice (Menachery et al., 2014), suggesting that IFIT family proteins
mediate the primary attenuation of SARS-CoV 2'-O-MTase knockout
mutants.

Fig. 7 (Cont'd) complex (pdb 2XYV). Nsp10 is shown in ribbon representation with conserved
residues colored in dark to light blue according to their conservation among
CoVs. Zinc molecules are shown as spheres and zinc-coordinating residues are shown
in stick representation. The surface of nsp16 is colored in dark to light red according
to the conservation of the respective residues among coronaviruses. SAH is depicted
in a stick model. The figure was generated using UCSF Chimera (Pettersen et al.,
2004). The degree of conservation of specific residues was determined using an alignment
of nsp10 and nsp16 sequences of eight coronaviruses (see Fig. 6).

The earlier results indicate that the nsp16 2’-O-MTase constitutes a new 
and attractive target for the development of antiviral drugs against SARS- 
CoV and HCoV-229E, as well as newly emerging CoVs like MERS- 
CoV and porcine epidemic diarrhea virus (PEDV). For example, the 
nsp10—nsp16 interface may be targeted to limit viral 2'-O-MTase activity 
and thus restore the antiviral responses mediated by Mda5 and IFIT1 
(Menachery and Baric, 2013). Interestingly, the nsp10 residues involved 
in the nsp10/nsp16 interaction are quite conserved within the CoV family 
and it was recently demonstrated that nsp10 of different CoVs (FCoV, 
MHV, SARS-CoV, MERS-CoV) is functionally interchangeable in the 
stimulation of nsp16 2'-O-MTase activity (Wang et al., 2015). Thus, 
molecules or peptides blocking this interface may have broad-spectrum anti- 
CoV effects, a concept that was explored and supported using synthetic 
peptides that mimic the nsp10 interface and suppress nsp16 2’-O-MTase 
activity in vitro (Ke et al., 2012; Wang et al., 2015). The antiviral effect 
of the MHV TP29 peptide, for example, was first demonstrated in 
MHV-infected cells and was subsequently confirmed to limit MHV repli- 
cation in mice and to enhance the type I interferon response (Wang 
et al., 2015). The same peptide was also effective in blocking the replication 
of a SARS-CoV replicon.


6. CORONAVIRUS nsp14 ExoN: KEY TO A UNIQUE
MISMATCH REPAIR MECHANISM THAT PROMOTES
FIDELITY


The CoV ExoN domain was identified on the basis of comparative 
sequence analyses (Snijder et al., 2003) that suggested a distant relationship 
of the nsp14 N-terminal domain (and equivalent polyprotein regions of 
other nidoviruses) with cellular DEDD exonucleases, a large protein super- 
family containing RNA and DNA exonucleases from all kingdoms of life 
(Zuo and Deutscher, 2001). The designation “DEDD” alludes to four 
invariant Asp/Glu residues that are part of three sequence motifs, I-III, that 
are conserved in members of this superfamily (Fig. 5). The DEDD super- 
family is also referred to as DnaQ-like family because it includes DnaQ, 
the ε subunit of E. coli DNA polymerase III, a well-characterized proofreading
enzyme (Echols et al., 1983; Scheuermann et al., 1983). Conservation of
a fifth residue, His, located four positions upstream of the conserved Asp in 
motif II identifies ExoN as a member of the DEDDh subgroup, while 
members of the DEDDy exonucleases contain Tyr at the equivalent posi- 
tion. The acidic residues are required to form two metal-binding sites. Based 
on catalytic models initially developed for cellular exonucleases and catalytic 
RNA (Beese and Steitz, 1991; Steitz and Steitz, 1993), the conserved His 
and the site A metal ion are thought to activate a water molecule that 
launches a nucleophilic attack on the phosphorus group of the 3’-terminal 
phosphodiester, while the site B metal ion is thought to stabilize the transi- 
tion state (Derbyshire et al., 1991). 

ExoN is conserved in CoVs and all other known nidoviruses with 
genome sizes of >20 kb (Gorbalenya et al., 2006; Minskaia et al., 2006; 
Nga et al., 2011; Snijder et al., 2003; Zirkel et al., 2011). The correlation
between genome size and ExoN conservation suggests that, in nidoviruses 
with medium- and large-size genomes, the correct nucleotide selection and 
recognition of properly formed base pairs by the RdRp is not enough to 
accomplish the necessary replication fidelity and, therefore, requires addi- 
tional functions suitable to detect and remove misincorporated nucleotides. 
Recently, biochemical evidence has been provided to suggest that ExoN 
may have exactly this function (Bouvet et al., 2012). CoV mutants that 
lack ExoN activity provided additional evidence for ExoN being 
involved in mechanisms that keep the CoV mutation rate at a relatively
low level (<10⁻⁶ mutations per site per round of replication for MHV
and SARS-CoV) (Eckerle et al., 2007, 2010), while other RNA viruses have
much higher mutation rates, ranging from 10⁻³ to 10⁻⁵ mutations per site
per round of replication (Drake and Holland, 1999; Sanjuan et al., 2010). 
ExoN knockout mutants of SARS-CoV and MHV were shown to display 
a mutator phenotype with significantly increased mutation frequencies 
approaching those of other RNA viruses (Eckerle et al., 2007, 2010). Con- 
sidering that these studies were performed under selection pressure favoring 
genotypes with high replication efficiency, the total number of mutations 
in RNAs produced by ExoN-deficient viruses may be even higher than cal- 
culated in that study, especially in genome regions that are not subject to 
selection in in vitro cell culture systems. While inactivation of ExoN activity 
was tolerated by MHV and SARS-CoV (albeit with reductions in replica- 
tion efficiency), stable ExoN-deficient mutants of the alphacoronaviruses 
HCoV-229E and TGEV could not be recovered (Becares et al., 2016; 
Minskaia et al., 2006), supporting the critical role of this activity in CoV 
replication. Taken together, the available information provides compelling 
evidence for ExoN playing a key role in high-fidelity replication of CoVs. 
Consistent with this hypothesis, genetically engineered ExoN knockout 
mutants were shown to be significantly more sensitive to RNA mutagens 
such as ribavirin and 5-fluorouracil (up to 300-fold). Furthermore, 
compared to wild-type virus, the ExoN knockout mutants were shown 
to accumulate a much higher number of mutations when propagated in 
the presence of mutagens (Smith et al., 2013). The lack of ExoN activity 
was also shown to have profound effects on viral replication and pathogen- 
esis in vivo (Graham et al., 2012). ExoN-negative mutants displayed a 
stable mutator phenotype in a number of mouse models of human SARS, 
providing a promising approach for the stable attenuation of highly patho- 
genic CoVs with important implications for vaccine development (Graham 
et al., 2012). 
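
To put per-site mutation rates into perspective, they can be converted into an expected number of mutations per genome copy. The short Python sketch below does this arithmetic under stated assumptions: the 30,000-nt genome length is a round figure chosen for illustration, the listed rates are illustrative orders of magnitude rather than measured values, and the Poisson approximation for the error-free fraction assumes independent errors at each site. The function names are hypothetical.

```python
import math


def expected_mutations_per_genome(rate_per_site, genome_length_nt):
    """Expected number of mutations introduced per genome copy."""
    return rate_per_site * genome_length_nt


def error_free_fraction(rate_per_site, genome_length_nt):
    """Fraction of copies without any mutation (Poisson approximation)."""
    return math.exp(-expected_mutations_per_genome(rate_per_site, genome_length_nt))


if __name__ == "__main__":
    genome_length_nt = 30_000  # assumed round value for a coronavirus genome
    for rate in (1e-6, 1e-5, 1e-4, 1e-3):  # illustrative per-site rates
        mean = expected_mutations_per_genome(rate, genome_length_nt)
        clean = error_free_fraction(rate, genome_length_nt)
        print(f"rate={rate:.0e}  mutations/genome={mean:.2f}  error-free={clean:.3f}")
```

With these assumed inputs, a per-site rate of 10⁻⁶ corresponds to about 0.03 mutations per genome copy, whereas 10⁻⁴ corresponds to about 3, which illustrates why a large genome is particularly sensitive to the loss of a fidelity-enhancing function.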

Coronavirus nsp14 is a bifunctional protein comprised of an N-terminal 
ExoN domain and a C-terminal N7-MTase domain. Surprisingly, the latter 
domain is not conserved in the Torovirinae (genera Torovirus and Bafinivirus), 
representing the other subfamily of the Coronaviridae, but in other (more dis- 
tantly related) nidovirus branches (Lauber et al., 2013; Nga et al., 2011). This 
conservation pattern suggests that a common ancestor of the Corona-, 
Mesoni-, and Roniviridae contained the two-domain ExoN/N7-MTase 
structure while some lineages lost the N7-MTase domain at a later stage. 
Although nsp14 does not require other proteins for activity (Chen et al., 
2007; Minskaia et al., 2006), its ribonucleolytic (but not the N7-MTase)
activity was shown to be stimulated significantly in the presence of nsp10. In
line with this, nsp10 variants carrying amino acid substitutions that prevent 
the interaction with nsp14 failed to stimulate ExoN activity (Bouvet et al., 
2012, 2014). To date, there is no evidence to suggest a direct role for nsp10 
in catalysis. Most likely, interactions between the nsp10 and the N-terminal 
domain of nsp14 stabilize the ExoN active site in a catalytically competent 
conformation. Mutagenesis data and a recent X-ray structure analysis of a 
SARS-CoV nsp10/nsp14 complex (see Fig. 6) revealed that the nsp10 sur- 
face required for interaction with nsp14 overlaps with the surface involved 
in the interaction and activation of the nsp16 2'-O-MTase activity (see ear- 
lier) (Bouvet et al., 2014; Ma et al., 2015). 

Coronavirus ExoN activities were first demonstrated using recombinant 
forms of SARS-CoV nsp14 expressed in E. coli (Minskaia et al., 2006). The 
protein was shown to require Mg²⁺ or Mn²⁺ ions for activity and to degrade
a range of single-stranded (ss) synthetic RNAs with 3'-to-5’ directionality to 
yield reaction products of about 8-12 nucleotides. The data further 
suggested that RNA secondary structure affects ExoN activity (Minskaia 
et al., 2006). Mutational analysis of predicted active-site (Asp/Glu) residues 
confirmed their critical involvement in catalysis (Minskaia et al., 2006). In a
subsequent study, using nsp10/nsp14 complexes, the substrate specificity 
was characterized in more detail and revealed dsRNA with a terminal mis- 
match to be the preferred substrate for ExoN activity. Excision efficiencies 
using different mismatched base pairs (A:G, A:A, A:C, U:G, U:C, U:U) 
were found to be similar, suggesting that the mismatch rather than the nature 
of the nucleotide misincorporated at the 3’ end determines ExoN activity 
(Bouvet et al., 2012). 
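
The substrate preference described above (a 3'-terminal mismatch in a double-stranded RNA) can be illustrated with a toy model. In the Python sketch below, the function names, the string representation of the strands, and the single-nucleotide excision step are simplifying assumptions made for illustration only. The template is written 3'-to-5' so that position i pairs with position i of the primer and, as in the list of tested mismatches, U:G is treated as a mismatch.

```python
# Watson-Crick pairs written as (primer nucleotide, template nucleotide)
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def has_terminal_mismatch(primer, template_3to5):
    """Return True if the 3'-terminal primer nucleotide is not Watson-Crick paired."""
    if not primer or len(template_3to5) < len(primer):
        raise ValueError("template must cover the full primer")
    last = len(primer) - 1
    return (primer[last], template_3to5[last]) not in WATSON_CRICK


def excise_terminal_mismatch(primer, template_3to5):
    """Remove the 3'-terminal nucleotide only if it is mismatched."""
    if has_terminal_mismatch(primer, template_3to5):
        return primer[:-1]
    return primer


if __name__ == "__main__":
    template = "UGCAG"  # written 3'-to-5', aligned with the primer
    print(excise_terminal_mismatch("ACGUA", template))  # terminal A opposite G
    print(excise_terminal_mismatch("ACGUC", template))  # terminal C opposite G
```

The model deliberately ignores which nucleotide is misincorporated, in keeping with the observation that excision efficiencies were similar for the different mismatches tested.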

In a recent study, the structures of unliganded, SAM-bound, and SAH- 
GpppA-bound SARS-CoV nsp10—nsp14 complexes were determined by 
X-ray crystallography (Ma et al., 2015). The structures provide important 
insight into the two-subdomain structure of nsp14, the two catalytic sites 
of ExoN and N7-MTase, critical substrate-binding residues, the contribu- 
tion of nsp10 to enhancing ExoN activity, and the roles of as many as three 
zinc fingers present in nsp14 (Fig. 6). In the structure, one molecule of nsp14 
was found to bind one molecule of nsp10. Given that nsp10 tends to form 
multimers (Su et al., 2006), it is tempting to speculate that, in infected cells, 
the nsp10—nsp14 complex may form even larger complexes, for example, by 
interacting with nsp10—nsp16 complexes that might be stabilized by nsp10— 
nsp10 interactions. In this way, consecutive methylation reactions of the 5’ 
cap structure could be spatially coordinated. Comparison of the nsp10
surfaces that interact with nsp14 and nsp16, respectively, revealed a significant
overlap (Bouvet et al., 2014), with a substantially larger surface being
involved in interactions between nsp10 and nsp14. Buried solvent accessible
areas for interactions with nsp14 and nsp16 were determined to be 2236 and
938 Å², respectively (Decroly et al., 2011; Ma et al., 2015). The structure of
the nsp10–nsp14 complex also helps to explain the observed stimulation of
ExoN activity by nsp10 (see earlier). Two regions of nsp10 interact extensively
with different structural elements of the nsp14 N-terminal domain,
most likely to maintain the structural integrity of the ExoN domain. Inter- 
estingly, the observed interaction of N-terminal residues of nsp10 with 
nsp14 led to interpretable electron density for these residues that had not 
been observed in previous structures of nsp10 (Joseph et al., 2007; Su 
et al., 2006) or the nsp10—nsp16 complex (Chen et al., 2011; Decroly 
et al., 2011), consistent with the proposed role of the N-terminal loop of 
nsp10 in stabilizing interactions with nsp14. 

The structure of the ExoN domain is essentially comprised of a twisted 
β-sheet that is formed by five β-strands and flanked by α-helices on either
side. It resembles that of other DEDD superfamily exonucleases, such as
the ε subunit of E. coli DNA polymerase III, but also has unique features.
These include two segments (residues 1-76 and 119-145) that are involved 
in the interaction with nsp10 and two zinc fingers in the ExoN domain (see 
Fig. 5). The second zinc finger was found in close proximity to the catalytic 
site. Both its position in the structure and mutagenesis data support a role for 
this zinc finger in catalysis (Ma et al., 2015). The other zinc finger appears to 
be required to maintain the structural integrity of nsp14. Consistent with 
this, nsp14 variants containing substitutions in zinc finger 1 proved to be 
insoluble when expressed in E. coli. 

Five residues predicted to coordinate two Mg²⁺ ions, Asp-90, Glu-92,
Glu-191, His-268, and Asp-273, were found in the catalytic site, with one
Mg²⁺ ion being coordinated by Asp-90 (ExoN motif I) and Glu-191 (motif
II) (Fig. 5). The second Mg²⁺ ion expected to be involved in the two-metal-ion-assisted
catalytic mechanism of ExoN (Beese and Steitz, 1991; Chen
et al., 2007; Ulferts and Ziebuhr, 2011) was not identified, presumably 
due to the lack of an RNA substrate and/or product in this structure. With 
one exception, metal ion-coordinating residues identified in the active site 
corresponded well to those identified in related cellular proofreading exo- 
nucleases (Beese et al., 1993; Hamdan et al., 2002) and previous predictions 
for nidovirus homologs (Snijder et al., 2003). The structure clearly revealed 
Glu-191 (instead of Asp-243) to be involved in catalysis, thus revising
previous predictions on the identity of ExoN motif II in the CoV nsp14 primary
structure and making nsp14 a “DEED outlier” in the DEDD superfamily
of exonucleases.

The combined structural and functional information obtained for nsp14 
including its subdomains and the nsp10 cofactor provides an excellent basis 
for studies using even larger multisubunit complexes to obtain insight into 
the coordinated action of key replicative enzymes involved in RNA synthe- 
sis, quality control, capping, methylation, and other functions (Subissi et al., 
2014a). 


7. CORONAVIRUS nsp15: A REMARKABLE 
ENDORIBONUCLEASE WITH ELUSIVE FUNCTIONS 


The nsp15-associated endoU domain is one of the most conserved 
proteins among CoVs and related viruses (Fig. 8), suggesting important 
functions in the viral replicative cycle. Already in the first sequence analyses 
of torovirus and arterivirus replicase genes published more than 25 years ago 
(den Boon et al., 1991; Snijder et al., 1990), the identification of a conserved 
sequence in the 3'-terminal ORF1b region, including the (at the time
unknown) endoU domain, was key to establishing phylogenetic relation- 
ships between corona- and toroviruses and, subsequently, also arteriviruses 
(Cavanagh and Horzinek, 1993; Snijder et al., 1993). Outside the
Nidovirales, no viral homologs of endoU have been identified to date. 
Together with the helicase-associated ZBD (see earlier), the nidoviral 
endoU has therefore been proposed to be a unique and universally con- 
served genetic marker common to all nidoviruses (Ivanov et al., 2004a; 
Snijder et al., 2003). Only recently, with the identification of the first
nidoviruses in insects (now classified in the family Mesoniviridae) and 
reanalysis of the ronivirus replicase gene, it was found that endoU is not 
conserved in those nidovirus branches that replicate in invertebrate hosts 
(Mesoniviridae, Roniviridae) (Lauber et al., 2012, 2013; Nga et al., 2011; 
Zirkel et al., 2013), suggesting specific roles in vertebrate hosts. To date, 
these functions and, more specifically, the biologically relevant substrates 
of endoU have not been identified. Characterization of MHV endoU 
knockout mutants revealed only minor effects on viral RNA synthesis 
in infected cells, with all RNA species being equally affected, and caused 
a slight reduction in virus titers, which was most evident at later time points 
post infection (Kang et al., 2007). For the arterivirus EAV, substitutions of 
several conserved residues in the endoU domain were found to cause more
profound effects, both in virus reproduction and viral RNA accumulation,
with sg RNA being more affected than genome replication in several
mutants. In some cases, reduction in virus titers by up to 5 log was
observed (Posthuma et al., 2006). The (limited) information obtained in
these studies suggests that endoU activity is not strictly required for
nidovirus RNA synthesis, at least in cell culture. However, the strong conservation
clearly suggests an important in vivo function that remains to be
identified in suitable model systems. Initial evidence for specific functions
of nidovirus endoU domains in infected cells was obtained in studies showing
that SARS-CoV nsp15, but surprisingly not the homologous proteins
from HCoV-NL63 and HCoV-HKU1, counteracts MAVS-induced apoptosis
(Lei et al., 2009). Further experiments will be necessary to confirm
and assess the significance of these functions for virus replication and/or
virus–host interactions.

Fig. 8 Alignment of nidovirus endoU domains and XendoU from Xenopus laevis. Residues
involved in catalysis (*) and substrate binding (&) are indicated. Abbreviations not
explained in the main text: EToV, Equine torovirus (subfamily Torovirinae, genus
Torovirus); WBV, white bream virus (subfamily Torovirinae, genus Bafinivirus).

Nidovirus-encoded endoU activities have been characterized using 
recombinant forms of CoV nsp15 (SARS-CoV, HCoV-229E, MHV- 
A59, IBV) and arterivirus nsp11 (EAV, PRRSV) (Bhardwaj et al., 2004; 
Ivanov et al., 2004a; Kang et al., 2007; Nedialkova et al., 2009). In a number 
of studies, recombinant nidovirus endoUs were shown (i) to have endonu- 
cleolytic activity, (ii) to cleave 3’ of pyrimidines, preferring uridine over 
cytidine, and (iii) to release reaction products with 2',3'-cyclic phosphate
and 5'-OH ends. Using suitable test substrates, RNA structural features were
shown to affect endoU cleavage efficiency, with unpaired pyrimidines being
processed more efficiently (Bhardwaj et al., 2006; Nedialkova et al., 2009).
The role of metal ions in endoU activity is not entirely clear, with somewhat 
contradictory data being reported for different homologs. While the activities
of cellular and CoV homologs were shown to require (or to be significantly
stimulated by) Mn²⁺ ions (Bhardwaj et al., 2004; Ivanov et al., 2004a;
Laneve et al., 2003, 2008), arterivirus endoU activities proved to be less
dependent on metal ions. Low concentrations of Mn²⁺ were found to stimulate
only marginally the arterivirus endoU activities, whereas higher concentrations
(previously shown to be required for optimal nucleolytic
activities in CoV and cellular homologs) inhibited the activities of EAV 
and PRRSV endoUs (Nedialkova et al., 2009). Metal ion requirements 
are commonly used to distinguish between the two basic catalytic mecha- 
nisms employed by ribonucleases: the metal-independent mechanism that, 
for example, is employed by RNase A and results in products with 2',3'-cyclic
phosphate ends (as described earlier for endoU), and the metal-dependent
mechanism in which catalysis is aided by two divalent cations
coordinated by conserved acidic residues and generates products with 3'-OH
and 5'-phosphate ends (as described earlier for ExoN). The critical
(or supportive) role of metal ions observed for several cellular and viral
endoU homologs is inconsistent with an RNase A-like (metal-independent)
reaction mechanism. Also, metal ions were not detected in any of the struc- 
tures determined for coronavirus nsp15s or XendoU, arguing against a direct 
role of metal ions in catalysis (Renzi et al., 2006; Ricagno et al., 2006). 
However, Mn²⁺ ions were found to change the intrinsic tryptophan fluorescence
of SARS-CoV nsp15, suggesting conformational changes that,
potentially, may affect activity and were shown to be unrelated to protein
multimerization (Bhardwaj et al., 2004; Guarino et al., 2005). Also, regarding
the role of Mn²⁺ in RNA binding, contradicting data have been
reported, with RNA binding by SARS-CoV nsp15 being enhanced in
the presence of Mn²⁺ or not affected by metal ions in the case of XendoU
(Bhardwaj et al., 2006; Gioia et al., 2005). 
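
The cleavage specificity summarized earlier in this section (cleavage 3' of pyrimidines, with a preference for uridine) can be illustrated with a minimal Python sketch. The function name is hypothetical, and the model is a deliberate simplification: it assumes exhaustive cleavage after every internal uridine and ignores the less preferred cytidine sites, RNA secondary structure, metal ions, and kinetics.

```python
def cleave_after_uridines(rna):
    """Split an RNA at every internal uridine, cutting on its 3' side.

    Each upstream fragment ends with the uridine (which would carry the
    2',3'-cyclic phosphate); each downstream fragment starts with the
    following nucleotide (which would carry the 5'-OH).
    """
    fragments = []
    start = 0
    for i, nt in enumerate(rna):
        if nt == "U" and i < len(rna) - 1:  # a 3'-terminal U has no bond to cut
            fragments.append(rna[start:i + 1])
            start = i + 1
    fragments.append(rna[start:])
    return fragments


if __name__ == "__main__":
    print(cleave_after_uridines("GAUCCUAAG"))
```

Joining the returned fragments reproduces the input sequence, which is a convenient sanity check when adapting the sketch to other cleavage rules.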

Structural information has been obtained by X-ray crystallography and 
cryoelectron microscopy studies for several CoV and cellular endoU homo- 
logs (Bhardwaj et al., 2008; Renzi et al., 2006; Ricagno et al., 2006; Xu 
et al., 2006). SARS-CoV and MHV nsp15s were shown to form
homohexamers comprised of a dimer of trimers. The nsp15 monomers have
an α+β structure with three subdomains, a small N-terminal, an
intermediate-sized middle, and a large C-terminal domain, the latter basically
representing the “conserved domain” of the CoV-like superfamily
(see earlier). One side of this domain contains two β-sheets that line the positively
charged active-site groove, while the other side is formed by five
α-helices. The structures of the catalytic domains are largely conserved
among CoV endoUs and XendoU, supporting the proposed common phylogeny
of viral and cellular members of this large endoribonuclease family
(Bhardwaj et al., 2008; Renzi et al., 2006; Snijder et al., 2003).

The endoU hexamer reported for SARS-CoV nsp15 (Ricagno et al.,
2006) has dimensions of 80-96 × 110 Å, forming a three-petal-shaped surface
that surrounds a small, predominantly negatively charged central channel
with an inner diameter of ~15 Å. The hexamer has six independent
active sites located on the surface of the molecule. Interactions between 
individual protomers are predominantly mediated by residues located in 
the N-terminal and middle domains. Database searches failed to identify 
closely related structural homologs, suggesting that endoUs diverged pro- 
foundly from other ribonucleases (Renzi et al., 2006; Ricagno et al., 
2006). Nevertheless, the presumed endoU catalytic residues (His/His/ 
Lys, Fig. 8) could be superimposed with the catalytic His/His/Lys residues 
of bovine RNase A, the prototype of a large superfamily of pyrimidine- 
specific ribonucleases (Ricagno et al., 2006; Ulferts and Ziebuhr, 2011). 
The superposition also includes a number of conserved substrate-binding 
residues. A comparison of viral and cellular endoU structures with that of 
RNase A and related nucleases using the PDBeFold server revealed 
similarities between the structural cores of these enzymes that may be described as 
“interrelated by topological permutation,” providing initial evidence for a 
common ancestry of the two endonuclease families (Ulferts and Ziebuhr, 
2011). Further studies are required to substantiate this hypothesis (see later). 

The hexameric form is thought to be the fully active form of CoV nsp15. 
This is supported by the exponential increase of activity with increased pro- 
tein concentrations, the reduced activities determined for protein variants 
that do not multimerize and the increased RNA-binding activities observed 
for hexameric forms of endoU (Bhardwaj et al., 2006; Guarino et al., 2005; 
Xu et al., 2006). Consistent with this, hexamerization has been confirmed 
for different CoV endoU homologs and residues confirmed to be involved 
in intersubunit interactions are highly conserved among CoVs. 
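
The steep concentration dependence noted above can be illustrated with a toy equilibrium model. The following Python sketch assumes a simple two-state scheme in which six monomers associate directly into one hexamer (6 M <-> H, with K = [H]/[M]^6) and in which only hexamers are active. The function name, the association constant, and the concentration units are hypothetical and are not taken from the cited studies, which report the trend rather than this model.

```python
def hexamer_fraction(total_conc, k_assoc):
    """Fraction of protomers found in hexamers for the assumed scheme 6 M <-> H.

    total_conc: total protomer concentration (arbitrary units, assumption)
    k_assoc:    association constant K = [H] / [M]**6 (hypothetical value)
    """
    if total_conc <= 0:
        return 0.0
    # Mass balance: total = [M] + 6 * K * [M]**6. The left-hand side grows
    # monotonically with [M], so the free monomer level is found by bisection.
    lo, hi = 0.0, total_conc
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid + 6 * k_assoc * mid**6 > total_conc:
            hi = mid
        else:
            lo = mid
    monomer = (lo + hi) / 2
    return 6 * k_assoc * monomer**6 / total_conc


if __name__ == "__main__":
    # Arbitrary concentrations spanning four orders of magnitude.
    for conc in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(f"total = {conc:>6}: hexamer fraction = {hexamer_fraction(conc, 1.0):.3g}")
```

In this sketch the hexameric fraction stays close to zero at low protein concentrations and then rises sharply, so the activity per unit of protein would increase far more than linearly with concentration. Stepwise assembly through dimers or trimers, which the dimer-of-trimers architecture might suggest, is not modeled here.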

In the structure of a truncated, monomeric form of SARS-CoV nsp15 
that lacks 28 N-terminal and 11 C-terminal residues, two loops of the cat- 
alytic domain were found to be displaced compared to their location in the 
hexamer, resulting in the destruction of the active site (Joseph et al., 2007). 
In hexameric structures, the two loops pack against each other and are sta- 
bilized by intermonomer interactions, suggesting that hexamerization may 
induce an allosteric switch. Furthermore, cross-linking and cryoelectron 
microscopy studies support a specific role of hexamerization in RNA bind- 
ing (Bhardwaj et al., 2006, 2008). 

In the structure model, the proposed active-site His and Lys residues 
(Fig. 8) identified by comparative sequence analysis and site-directed 
mutagenesis (Bhardwaj et al., 2004; Gioia et al., 2005; Guarino et al., 2005; 
Ivanov et al., 2004a; Kang et al., 2007; Snijder et al., 2003) were found to be 
embedded in a positively charged groove of the catalytic domain. Other res- 
idues identified in the active site and proposed to be involved in binding to 
the substrate phosphate include the side chain of a conserved Thr (Fig. 8) and 
the main chain amide of a conserved Gly (Fig. 1) (Bhardwaj et al., 2008; 
Ricagno et al., 2006; Xu et al., 2006). The proposed functional role of 
the latter residues has also been corroborated by mutagenesis data for several 
endoU homologs (Kang et al., 2007; Renzi et al., 2006; Ricagno et al., 
2006). 

As mentioned earlier, endoU and RNase A share a number of features in 
their active sites. In RNase A, pyrimidine binding primarily involves Thr-45 
and Phe-120. While Thr-45 forms hydrogen bonds with the pyrimidine 
base, Phe-120 interacts with the base through stacking interactions 
(Raines, 1998). The Ser-293/Tyr-342 residues of SARS-CoV nsp15 and 
the Thr-45/Phe-120 residues of RNase A occupy similar positions in the 
active-site clefts of the two enzymes (Ulferts and Ziebuhr, 2011). The 
Ser/Thr residue is conserved in viral and cellular domains, whereas Tyr-342 
is conserved in viral endoU homologs, with conservative substitutions 
(Phe, Trp) occasionally found in cellular endoU homologs (Renzi 
et al., 2006; Ricagno et al., 2006). The role of SARS-CoV endoU Ser- 
293 and Tyr-342 (and equivalent residues in related enzymes) in substrate 
binding received strong support from molecular modeling and site-directed 
mutagenesis data, with the conserved Ser/Thr residue being confirmed to 
have a critical role in the differential cleavage of uridine- and cytidine- 
containing substrates (Bhardwaj et al., 2008; Nedialkova 
et al., 2009; Ricagno et al., 2006). 

Similarities in their active-site structures and reaction products 
containing 2',3'-cyclic phosphate ends suggest that endoUs and RNase A-like 
endoribonucleases employ similar catalytic mechanisms (Nedialkova et al., 
2009; Ricagno et al., 2006). For RNase A, it has been established that two 
His residues in the active site act as general base and acid, respectively. The 
His residue that acts as a general base abstracts a proton from the ribose 
2'-hydroxyl group, which then attacks the adjacent phosphorus atom, leading 
to cleavage of the P-O5' bond. The second 
His donates a proton to the 5'-O, thus facilitating displacement of this 
group and subsequent product release (Raines, 1998). In a second step, the 
2',3'-cyclic phosphate is hydrolyzed, resulting in a 3'-phosphomonoester 
product and recovery of the enzyme. This hydrolysis step is essentially the reverse 
of the transphosphorylation reaction, with the protonated His now acting as 
an acid and the other His acting as a base. In both reaction steps, Lys interacts 
with the phosphate to stabilize the pentavalent reaction intermediate. 

Although the reaction mechanisms employed by endoUs have not been 
studied in detail, the enzymes are thought to use an RNase A-like catalytic 
mechanism. This is supported by several lines of evidence, including (i) the 
conserved spatial positions in the structure and the critical functional role of 
two His residues and one Lys residue (Ricagno et al., 2006), and (ii) the release of 
2',3'-cyclic phosphate-containing reaction products (Bhardwaj et al., 2004; 
Ivanov et al., 2004a) that are converted to products with 3’-hydroxyl ends 
(Nedialkova et al., 2009). 
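
To make the product ends discussed above concrete, the Python sketch below enumerates the fragments expected if an RNA were cut 3' of every uridine by the transphosphorylation step alone. This is a deliberate simplification: the function name and the all-or-none cleavage rule are assumptions made for illustration, and sequence context, RNA structure, and any subsequent processing of the cyclic phosphate are ignored.

```python
def predicted_fragments(rna, bases="U"):
    """List (fragment, 3'-end label) pairs for cleavage 3' of each target base.

    Assumption: every listed base is cut, and each cut leaves a
    2',3'-cyclic phosphate on the upstream fragment.
    """
    rna = rna.upper()
    fragments = []
    start = 0
    for i, nt in enumerate(rna):
        # A terminal nucleotide has no downstream phosphodiester bond to cut.
        if nt in bases and i < len(rna) - 1:
            fragments.append((rna[start:i + 1], "2',3'-cyclic phosphate"))
            start = i + 1
    # The last fragment keeps whatever 3' end the input RNA had.
    fragments.append((rna[start:], "original 3' end"))
    return fragments


if __name__ == "__main__":
    for fragment, end in predicted_fragments("GAUCCUAG"):
        print(fragment, "|", end)
```

Passing `bases="UC"` would mimic a purely pyrimidine-specific enzyme of the RNase A type, which gives a simple way to contrast the two specificities on the same hypothetical substrate.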


8. SUMMARY AND FUTURE PERSPECTIVES 


Since the first in-depth analysis of a CoV replicase in 1989 
(Gorbalenya et al., 1989), significant progress has been made in terms of 
its structural and functional characterization. A multitude of enzymatic func- 
tions has been identified and characterized in vitro, although mainly using 
artificial substrates so far. Protein structures were obtained for most of the 
subunits from the nsp7-16 region (Neuman et al., 2014b), but unfortunately 
two prominent remaining “blank spots” on this map concern two key 
enzymes in CoV RNA synthesis, RdRp and helicase. Filling those gaps 
would constitute an important step forward, to address basic questions like 
the priming mechanism employed by the RdRp and the function of the 
NiRAN domain, and to accelerate targeted drug discovery, for example, 
in the area of nucleoside inhibitors of CoV RNA synthesis, which has 
received little attention thus far. Clearly, where CoV mRNA capping is 
concerned, identification of the elusive GTase remains a research priority 
(Subissi et al., 2014a). For other nsps, potential functions (nsp9, the N-ter- 
minal domain of nsp15) or substrates (nsp15-endoU) remain to be found. 

As highlighted by several nsp-wide interaction screening studies (Imbert 
et al., 2008; Pan et al., 2008; von Brunn et al., 2007) and more specifically by 
the in vitro data on the interplay between nsp7-8-12-14 (Subissi et al., 
2014b) and nsp10-14-16 (Bouvet et al., 2014), CoV nsps need to work 
together in many ways. The further characterization of the “nsp 
interactome,” now also inside the CoV-infected cell, will undoubtedly pro- 
vide more clues as to how specific functions are switched on and off or mod- 
ulated. Likewise, attention should be given to defining the interactions of 
CoV nsps with the specific RNA signals for genome replication 
(Madhugiri et al., 2014; Yang and Leibowitz, 2015), discontinuous 
minus-strand RNA synthesis (attenuation at body TRSs, nascent minus-strand 
transfer, and reinitiation; Pasternak et al., 2006; Sawicki et al., 2007; Sola 
et al., 2011), and the transcription, capping, and polyadenylation of sg 
mRNAs (Fig. 1). These RNA sequences, several of which may be cis-acting, 
could provide starting points for improved biochemical assays, ultimately 
paving the way for the complete in vitro reconstitution of some of these 
multi-nsp-driven processes. 

The analysis of CoV RTC structure and function in the living infected 
cell remains an enormous technical challenge, requiring continued toolbox 
development. It is likely that several functional riddles can only be solved by 
studying infected cells. For example, the endoribonuclease activity of the 
nsp15 endoU domain, a potential “suicide enzyme” for an RNA virus, must 
be controlled tightly in the infected cell. Whereas the enzyme is highly 
active and displays only very limited substrate specificity in vitro (Ivanov 
et al., 2004a; Nedialkova et al., 2009), it may be confined to a specific com- 
partment in the infected cell and/or its activity may be modulated by inter- 
actions with other nsps or host factors. Such differences between in vitro and 
in vivo activities will surely emerge for other nsps as well, and they may be 
better understood following the further characterization (including their 
lipid composition) of the membranous replication organelles with which 
the metabolically active CoV RTC presumably is associated (Hagemeijer 
et al., 2012; Neuman et al., 2014a; van der Hoeven et al., 2016). These stud- 
ies should also answer the question of how both nsps and viral RNA sub- 
strates are targeted to or recruited by the membrane-bound CoV RTC, 
in particular also during the earlier stages of infection when viral RNA syn- 
thesis appears to be taking off in the absence of the prominent membrane 
rearrangements observed later in infection. 

Specific mutations in CoV genomes can now be reverse engineered, but 
many of the functions encoded by nsp7-nsp12 are so basic that their 
inactivation will merely result in dead virus mutants that do not provide many 
deeper insights into nsp function. This in part explains why most progress 
thus far has been made for some of the functions that can—fortunately— 
be inactivated without such lethal consequences, like the nsp14 ExoN 
and nsp16 2'-O-MTase enzymes (Eckerle et al., 2007, 2010; Graham 
et al., 2012; Menachery et al., 2014; Smith et al., 2013; Zust et al., 
2011). For this reason, the field should continue to also employ 
“traditional” (forward) genetic methods to characterize (and produce more) 
conditionally defective CoVs, like temperature-sensitive mutants (Sawicki 
et al., 2005). Thanks to the advent of next-generation sequencing 
technologies, tracing the evolution of crippled virus mutants and (pseudo) 
revertants has become much more straightforward than before, and this 
approach (letting the virus do the work) likely is among the most economical 
ones in uncovering previously unknown interactions between the protein 
and RNA players in CoV replication (Zust et al., 2008). Furthermore, it 
may be possible to develop cell-based assays for the analysis of CoV nsp func- 
tions that do not rely on having a replication-competent virus to start with. 

Unraveling the molecular mechanisms underlying the presumed mis- 
match excision function (Bouvet et al., 2012) of the nsp14-ExoN, which 
is uniquely encoded by RNA virus genomes larger than 20 kb (Nga 
et al., 2011), connects to the mechanisms driving the evolution of 
nidoviruses at large (Lauber et al., 2013). Also, replicases from other 
nidovirus branches will need to be studied to fully understand the basic prin- 
ciples governing the profound divergence and genome expansion of this 
exceptional order of +RNA viruses. The error rate and genomic plasticity 
of RNA viruses are among their most fascinating features, and also form the 
basis for the many problems caused by RNA virus mutation and adaptation, 
including successful zoonotic transfer. As exemplified by the viable CoV 
mutants lacking the ExoN or 2'-O-MTase functions (Graham et al., 
2012; Habjan et al., 2013; Menachery et al., 2014; Zust et al., 2011), the 
functional characterization of CoV replicative enzymes can be key to the 
development of conceptually new live attenuated vaccine prototypes. Like- 
wise, it will contribute to the development of broad-spectrum and highly 
effective antiviral drugs targeting essential enzyme functions, critical inter- 
actions with nsp cofactors, or “nonessential” nsp functions that promote effi- 
cient viral replication and/or pathogenesis. As highlighted by the SARS and 
MERS outbreaks of the past 15 years, having such compounds available 
would definitely strengthen our first line of defense against CoV infections 
in humans. 